AstroPulse v7 v7.10 (opencl_nvidia_100)

Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1671842 - Posted: 30 Apr 2015, 5:39:22 UTC

I finally got some AP tasks today and have been trying to figure out what CPU/GPU usage is best for them. I know previous AP GPU tasks were CPU intensive, so I shut down a CPU if 1 of 2 tasks running were AP. No problem there.

But I must say ... holy crap, the CPU usage is nuts with v7.10! It's taking a complete core (100% use of 2.9GHz) to feed each of 2 tasks to a 750ti. So that's 2 full cores out of 4.

On my XP 4200+ 2 core, it's impossible to run 2 GPU + 1 CPU task. The main reason is v7.10 is set at 'below normal' priority, and CPU tasks are all 'low' priority. So 2 GPU tasks completely choke out any CPU task (0-2% use), and there is no way for them to run. My CPU tasks would all sit idle until I run out of AP GPU tasks. And likely time out. So I lowered it to 1 task only if AP is running. Not in favor of that but it's all I can do.

Is it bad programming or what that v7.10 needs that much CPU to feed the GPU? And why is the priority set higher than CPU tasks?
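
The "1 task only if AP is running" workaround can be written down in app_config.xml. The following is a minimal sketch, assuming the application name astropulse_v7 and a single GPU; the values are illustrative, not a tested recommendation.

```xml
<app_config>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <!-- each AP task claims the whole GPU, so only one runs at a time -->
      <gpu_usage>1.0</gpu_usage>
      <!-- ask the BOINC scheduler to budget a full CPU core per AP GPU task -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

Note that cpu_usage only changes how many CPU tasks BOINC schedules alongside the GPU task; it does not limit how much CPU the GPU app actually consumes.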
ID: 1671842 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1671850 - Posted: 30 Apr 2015, 5:55:48 UTC - in response to Message 1671842.  

Are you using any command lines for the APs?

I know I had discussed this before about CPU usage of the new APs.
ID: 1671850 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1671855 - Posted: 30 Apr 2015, 6:04:13 UTC - in response to Message 1671850.  

Current config for my i5, same command line for MB and AP. GPU 750ti.


<app_config>

<app>
<name>setiathome_v7</name>
<cmdline>-use_sleep -unroll 16 -oclfft_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 4 1 -tune 2 64 4 1 -hp</cmdline>
<gpu_versions>
<gpu_usage>0.5</gpu_usage>
<cpu_usage>0.4</cpu_usage>
</gpu_versions>
</app>

<app>
<name>astropulse_v7</name>
<cmdline>-use_sleep -unroll 16 -oclfft_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 4 1 -tune 2 64 4 1 -hp</cmdline>
<gpu_versions>
<gpu_usage>0.5</gpu_usage>
<cpu_usage>0.6</cpu_usage>
</gpu_versions>
</app>

</app_config>

ID: 1671855 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1671877 - Posted: 30 Apr 2015, 7:03:35 UTC - in response to Message 1671855.  

For a test I completely removed the command line for AP (yes, I waited for new tasks to start).

Running 2 tasks on a 750ti with an i5 2.9GHz CPU, it still uses 2 full cores at 100% usage to feed 2 GPU tasks.

That is just crazy CPU usage for a stock config!
ID: 1671877 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1671894 - Posted: 30 Apr 2015, 7:33:56 UTC - in response to Message 1671877.  
Last modified: 30 Apr 2015, 7:44:31 UTC

It's not the fault of the OpenCL app for NVidia GPUs.

AFAIK, NVidia decided to make their drivers buggy when running OpenCL apps (OpenCL being ATI/AMD's 'kingdom').
The last good driver for OpenCL apps was 263.06, for (just pre-Fermi?)* NV GPUs.

Because of this you 'must' use at least '-use_sleep' in the cmdline to free up your CPU.

BTW, the SETI v7 app for NVidia GPUs is the CUDA app, which doesn't have (or work with) the kind of settings you use in your app_config.xml file.

[* this was in my case, WinXP x86 and GTX285]
ID: 1671894 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1671899 - Posted: 30 Apr 2015, 7:43:53 UTC - in response to Message 1671894.  
Last modified: 30 Apr 2015, 7:44:58 UTC

BTW, I forgot: the priority of the CUDA or OpenCL GPU app should be higher than that of the CPU apps,
because the GPU normally has more performance than the installed CPU. So it's good to give the GPU more support from the CPU.
ID: 1671899 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1671905 - Posted: 30 Apr 2015, 8:34:16 UTC - in response to Message 1671855.  
Last modified: 30 Apr 2015, 8:49:55 UTC

It might be a good idea to read the manuals about now.
Most people place those command lines in different files named similar to;
ap_cmdline_win_x86_SSE2_OpenCL_NV.txt
and
mb_cmdline_win_x86_SSE2_OpenCL_ATI.txt
Or
mbcuda.cfg for CUDA MB

AP ReadMe, http://lunatics.kwsn.info/downloads/v0.43a_ReadMe_AstroPulse_OpenCL_NV.txt
<app_config> Instructions, http://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration

There also should be ReadMes in your setiathome.berkeley.edu folder.
ID: 1671905 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1671911 - Posted: 30 Apr 2015, 9:03:39 UTC - in response to Message 1671905.  

But as Brent Norman said, a stock app should run 'adequately' (my word, not his) straight out of the box, like any other BOINC application.

All this messing around with ReadMes in what would normally be hidden data folders, manual configuration of complex command lines etc. etc. betrays the application's origin as "for advanced users only" - geeks and enthusiasts. It would be good to concentrate for a while on the 'stockness' of the applications, and try to find a default distribution model which allows it to run with the minimum of interference with either the user or with other projects' BOINC (CPU) applications.
ID: 1671911 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1671915 - Posted: 30 Apr 2015, 9:13:41 UTC - in response to Message 1671911.  

As soon as Brent Norman added a flawed <app_config> to his 'hidden folder' it became something other than 'stock'. If you are going to add something to 'stock', you should read the manual first, or at least after it doesn't work as expected. If you want 'stock', then don't complain about a nVidia driver 'feature' that has existed since Driver 266.58.
ID: 1671915 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1671917 - Posted: 30 Apr 2015, 9:16:43 UTC - in response to Message 1671911.  

Yes I understand some but not all of the configs, this command line was recommended to me for this card. And the options make sense to me for what I'm running.

I'm at a command line of "-use_sleep" and nothing more. CPU usage hasn't changed from a full command line, to nothing, to this one.

And thanks Richard for understanding what I'm saying.

My main point is ... Why does the app need a full core of 2.9Ghz to feed 1 task? That is just silly. Sure MBs take 2-5%, AP is more aggressive so 20-25% would be OK, but a full CPU?

I would bet if I had 8Ghz, it would take all that too for 1 GPU task.
ID: 1671917 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1671924 - Posted: 30 Apr 2015, 9:31:17 UTC - in response to Message 1671915.  

I think we should leave the MB / AP comparison out of this discussion for the time being - that comes from using different underlying programming languages for the two applications.

TBar wrote: "If you want 'stock', then don't complain about a nVidia driver 'feature' that has existed since Driver 266.58."

And I have never seen an adequate exploration of why that seems to have happened, and what steps have been taken to understand and mitigate it.

I'm not thinking specifically of AP, or SETI, or BOINC, here. OpenCL for GPUs is a general-purpose tool. There must be a wider user community, developing other applications for other uses. Is this CPU usage since 266.58 universal across all developments? Does it apply to both Windows XP and the Windows Vista/7/8 ranges, which use very different underlying driver models? Somebody, somewhere, must have asked themselves those questions, and come up with some (at least speculative) answers. Where are they - can anyone link?
ID: 1671924 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1671937 - Posted: 30 Apr 2015, 10:05:20 UTC - in response to Message 1671924.  
Last modified: 30 Apr 2015, 10:06:48 UTC

Well, Raistmer has explained it a few times here, as you are aware. He appears to be lurking about, so I'll let him explain it again. I have run across other people complaining about the CPU use in other locations. Here's a few posts about it, this one sounds related; Increased CPU usage with last drivers starting from...
Oh my, it's Raistmer...

Why does FireFox keep reverting to the Yahoo search engine???
ID: 1671937 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1671938 - Posted: 30 Apr 2015, 10:13:53 UTC - in response to Message 1671937.  

TBar wrote: "Well, Raistmer has explained it a few times here, as you are aware. He appears to be lurking about, so I'll let him explain it again. I have run across other people complaining about the CPU use in other locations. Here's a few posts about it, this one sounds related; Increased CPU usage with last drivers starting from... Oh my, it's Raistmer..."

That's the problem. Only Raistmer seems to be experiencing/talking about it.

And using the word "bug" in your search term is pre-determining the outcome. I'm seeking a more fundamental understanding, open to the possibility that a new 'feature' (in the true sense) might require an update to previously-established programming techniques.
ID: 1671938 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1671966 - Posted: 30 Apr 2015, 11:51:54 UTC - in response to Message 1671938.  
Last modified: 30 Apr 2015, 11:56:07 UTC

No comments.
ID: 1671966 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1671970 - Posted: 30 Apr 2015, 12:03:02 UTC - in response to Message 1671966.  

Raistmer wrote: "No comments."


Innit.


With each crime and every kindness we birth our future.
ID: 1671970 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1672009 - Posted: 30 Apr 2015, 13:27:19 UTC - in response to Message 1671966.  

Raistmer wrote: "No comments."

OK, I can see how you may have read it differently than intended. I'll rephrase it.
When I searched for 'bugs' the 4th post down was "Increased CPU usage with last drivers starting from...". I clicked on it but didn't read the author. I thought it would be a good link and made the post. Later I went back and read it and saw how it was your post from 3 years ago. What I should have said was, Oh My, it's Raistmer from three years ago trying to get nVidia to suggest a Fix for this problem that a number of different Developers are having...

Happy?
;-)
ID: 1672009 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1672063 - Posted: 30 Apr 2015, 15:58:49 UTC - in response to Message 1671924.  

IMO, NVIDIA simply chose to assume that those running OpenCL apps on their GPUs would want the ultimate possible performance. So if OpenCL enqueues another kernel before a previous kernel is complete their implementation uses something like a CPU spin loop to provide the least possible delay in getting the new kernel started.

With -use_sleep, Raistmer's apps delay enqueueing a new kernel until after the previous kernel is done (for the most time-consuming kernels), thereby avoiding most of that extra CPU "usage". But the sleep doesn't end at exactly the right time, so there's added latency; some tuning can minimize that.

Brent's problem is that although <cmdline> can be used within an app_config.xml it is only valid within an <app_version> section. I'd be guessing if I tried to suggest the exact content of that <app_version> section, so I won't.
Joe
ID: 1672063 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1672095 - Posted: 30 Apr 2015, 16:36:58 UTC - in response to Message 1672063.  

Josef W. Segur wrote: "Brent's problem is that although <cmdline> can be used within an app_config.xml it is only valid within an <app_version> section. I'd be guessing if I tried to suggest the exact content of that <app_version> section, so I won't."

It's documented at http://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration

Most, except <app_name>, is optional, and <plan_class> can be copied from the apps page, or re-typed from the BOINC Manager display.
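
As a sketch of what that might look like for the case in this thread: the <cmdline> moves out of <app> into a separate <app_version> block. The app name astropulse_v7 is taken from the config posted earlier and the plan class opencl_nvidia_100 from the thread title; both are assumptions here and should be checked against what BOINC Manager shows on your own host.

```xml
<app_config>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.6</cpu_usage>
    </gpu_versions>
  </app>
  <!-- <cmdline> is only valid inside <app_version>, not inside <app> -->
  <app_version>
    <app_name>astropulse_v7</app_name>
    <!-- assumed plan class; copy the exact string from the apps page or BOINC Manager -->
    <plan_class>opencl_nvidia_100</plan_class>
    <cmdline>-use_sleep</cmdline>
  </app_version>
</app_config>
```

After saving the file in the project folder, the client has to re-read its config files (or be restarted), and the new command line should only apply to tasks started after that.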
ID: 1672095 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1672103 - Posted: 30 Apr 2015, 16:45:36 UTC - in response to Message 1672063.  
Last modified: 30 Apr 2015, 16:57:49 UTC

That's the closest so far to what I'd expect the real choices/decisions might have involved, considering that kind of decision-making goes on behind closed doors.

A subtle irony in all this, is that I've been working toward getting rid of 'old-school-cuda-blocking-synch' in the MB Cuda application, because frankly it's not the most efficient way to go either. Unsurprisingly, the cheapest/most-efficient synchronisation methods seem to be turning out to be graphics-api like ones, such as precision multimedia timer-based or frame renderloop based ones.

It's easy to forget these devices evolved from, and are, graphics devices, drivers and apis. So expecting them to behave otherwise as 'pure abstract opencl virtual machines' is probably a little unrealistic.

With something as low level as OpenCL, as with Cuda driver api, you're expected to be able to 'roll your own', or use a higher level api & libraries.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1672103 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1672831 - Posted: 1 May 2015, 23:43:51 UTC

I find this interesting...

2 computers
XP 4200+ 2 core
8.1 2.9GHz 4 core

Both with 750ti running identical configs, 1 AP task on GPU, same command line.

As expected they both took over 1 core for the GPU task with v7.10

But, run times were also identical. So it leads me to believe that v7.10 is hungry for CPU time and not processing power.

For the hell of it I tried 2 AP v7.10 + 1 MB Cuda50 on my 4200+ 2 core, and the 2 AP tasks completely took over the CPU, and choked out the Cuda50 task along with the CPU.

Even though the Cuda50 task had the same CPU priority as the v7.10 tasks, it could not get any processing time to run.

Something is wrong with v7.10 I say.
ID: 1672831 · Report as offensive
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.