Public beta for nVidia AstroPulse, rev 521
Author | Message |
---|---|
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks. Is it the platform and plan class that make the scheduler ask specifically for AP GPU WUs? Does the regular AK_V8 AP section ask for AP CPU work? Just trying to get a handle on how the BOINC Manager decides who gets to crunch the data. I have 4 CPU AP WUs already on board and don't want to upset the apple cart just yet. It may take a while to get some work, as the project seems to be unreachable currently and no work has built up from the splitters. I hope that once the floodgates open on Wednesday I'll get a chance to try out the new application. Cheers, Keith Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
When someone finds the correct setup for a GTX 470, can you please post here or PM me? I hate the idea of trashing good work. Thanks |
halfempty Joined: 2 Jun 99 Posts: 97 Credit: 35,236,901 RAC: 114 |
I thought about what you said about unroll set to 8, so I decided to try it. The AP task still ran about the same, but I noticed the MB task running with it on my GTS 450 suddenly slowed way down. When I checked, its time-to-completion guesstimate was double what it had been. I stopped and went back up to 10, and now the MB task is running much faster again. Thanks for that info. From a previous release of the OpenCL AP app, I thought I remembered that making it a multiple of something on the card was the most efficient. I searched through and tracked down a post mentioning it. http://setiathome.berkeley.edu/forum_thread.php?id=62738&nowrap=true#1068112 I only run 1 WU at a time so I won't see the degradation you saw, but I'll have to play around to check if it makes much difference on this card. On my winter card, an HD5850, I read the number of compute units from stdout and halved it for my unroll. I was doing unblanked AP in about 68 minutes. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
To maximise the load on the GPU's compute units, the unroll factor should be a multiple of the GPU's CU count. But sometimes that can lead to a weird memory layout that causes more slowdown (not all memory channels are in use at specific strides). Then the performance hit will be bigger than the benefit. So some experimentation (or a strong understanding of the architecture of the particular GPU in use) is required with the unroll factor. For now I have statistics for the HD6950: an unroll of 16 is only slightly better (on real-world tasks) than an unroll of 6. For the GSO9600 I have no statistics. I only know that an unroll of 12 caused invalid overflows while an unroll of 10 works OK. It takes a lot of time to collect a reliable data set... |
CryptokiD Joined: 2 Dec 00 Posts: 150 Credit: 3,216,632 RAC: 0 |
I have heard somewhere that running Astropulse does not give as high a RAC as Multibeam does for CPU. Does anyone know if this is true for nVidia GPUs as well? |
Kevin Olley Joined: 3 Aug 99 Posts: 906 Credit: 261,085,289 RAC: 572 |
When someone finds the correct setup for a GTX 470, can you please post here or PM me? I hate the idea of trashing good work. A post here would be good. Kevin |
Mike Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
I have heard somewhere that running Astropulse does not give as high a RAC as Multibeam does for CPU. Does anyone know if this is true for nVidia GPUs as well? I can almost double my RAC running Astropulse only on my HD 5850. But it's hard to get enough work atm. With each crime and every kindness we birth our future. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
I have heard somewhere that running Astropulse does not give as high a RAC as Multibeam does for CPU. Does anyone know if this is true for nVidia GPUs as well? That is probably true on a lot of AMD CPUs, since the CPU-optimised AP app is quite cache intensive. Intel CPUs should still give a similar amount of credit/sec on MB and AP WUs (subject to what NewCredit gives out). On ATI GPUs the ATI OpenCL AP app is a lot faster than the MB app: my HD5770 can only do 3 normal MB WUs in the time a 0% blanked AP WU takes. On Nvidia GPUs it's probably quite close: my GTX460 takes 50 minutes per 0% blanked AP task and a normal-AR MB WU takes ~10 minutes, so 5 MB WUs at 120 credits make ~600 credits, and that AP WU should make 600 to 800 credits (after doing 10 validations first, and subject to NewCredit). Claggy |
Miep Joined: 23 Jul 99 Posts: 2412 Credit: 351,996 RAC: 0 |
On the other end of the spectrum (i.e. very low end), I've been running <cmdline>-ffa_block 2048 -ffa_block_fetch 1024 -unroll 4</cmdline> I can probably go somewhat higher, but if tasks take some 16 hours on a host running about 8 hours a day for 5 days a week, any testing takes a lot of patience ;) Carola ------- I'm multilingual - I can misunderstand people in several languages! |
Highlander Joined: 5 Oct 99 Posts: 167 Credit: 37,987,668 RAC: 16 |
I have installed it with your given app-info entries; the only change: I deleted the -hp switch. I replaced the whole AP CPU section with the GPU version. So far, these are my tasks: http://setiathome.berkeley.edu/result.php?resultid=2000852972 http://setiathome.berkeley.edu/result.php?resultid=2000647981 http://setiathome.berkeley.edu/result.php?resultid=2000647979 http://setiathome.berkeley.edu/result.php?resultid=2000647974 2) 2 hosts already reported greatly increased CPU consumption when running with 27x.xx drivers. Yes, indeed, I have high CPU usage: with 0 percent blanked, about 98% of an HT core. It's less if more is blanked (about 85% CPU usage at the 4.x blanked). But so far I've had too little time (or too few WUs) to do more testing with the parameters. I personally would like to know what happens on my system when no CPU lock is active. Greetings and thanks again for the great work. Chris - Performance is not a simple linear function of the number of CPUs you throw at the problem. - |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I have installed it with your given app-info entries; the only change: I deleted the -hp switch. I replaced the whole AP CPU section with the GPU version. Yeah, CPU consumption is too high with the 275.xx driver. Could you downgrade to 267.xx and check CPU usage there? Without CPUlock no affinity will be set, and different GPU tasks would compete for the same CPU. |
JohnDK Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127 |
Finished first AP http://setiathome.berkeley.edu/result.php?resultid=2000735263 and already validated with 656.27 CR. It took 56m16s, not bad I think. It used 13% CPU, which seems a lot; is that due to the 275.33 drivers? |
Jamie Joined: 5 Apr 06 Posts: 162 Credit: 9,867,955 RAC: 0 |
Possibly. There do seem to be a few more people seeing the high CPU usage; try the 267.xx drivers if you get the chance. I'm running them and see 3-6% CPU usage. |
S@NL - eFMer - efmer.com/boinc Joined: 7 Jun 99 Posts: 512 Credit: 148,746,305 RAC: 0 |
Just slightly off topic. Is there anyone running both nVidia and ATI Astropulse, or is that impossible? TThrottle Control your temperatures. BoincTasks The best way to view BOINC. Anza Borrego Desert hiking. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Finished first AP http://setiathome.berkeley.edu/result.php?resultid=2000735263 and already validated with 656.27 CR. I would say it uses a whole CPU (one logical CPU, not all the CPUs in your system). Elapsed time almost equals CPU time... Run time 3,376.00 CPU time 3,372.34 And yes, most probably it's because of the 275.xx drivers... a downgrade is needed. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Just slightly off topic. Claggy and Ghost have been doing this since early alpha. Yes, it's possible. See the second (Claggy's) post in this thread for how to properly configure both apps. |
Highlander Joined: 5 Oct 99 Posts: 167 Credit: 37,987,668 RAC: 16 |
So here is one with the 266.58 driver version (I have some problems/issues with 267.x): http://setiathome.berkeley.edu/result.php?resultid=2001207330 CPU usage is about a third of one HT core, so on this front much better. But I'm switching back to 275.33 because the x38g MB runs _much_ better with that version. FYI (if it is of use at all?): - Performance is not a simple linear function of the number of CPUs you throw at the problem. - |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
What do you mean by this? Faster, smoother? No lags? Less CPU consumption? What is better? |
JohnDK Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127 |
2nd AP finished http://setiathome.berkeley.edu/result.php?resultid=2000758397 and was validated against an ATI wingman which was faster, only 284.76 CR though :) Used 267.24 drivers and it took 45m22s. It used 0-7% CPU. |
Highlander Joined: 5 Oct 99 Posts: 167 Credit: 37,987,668 RAC: 16 |
What do you mean by this? Faster, smoother? No lags? Less CPU consumption? What is better? So far I have had no issues with lags or similar effects on my computers with the drivers and app versions I have chosen (knock, knock, knock). It's only the speed, which is IMO much better. As a reference, I take my Phenom computer with XP and a stock-clocked GTX 260 216: average execution times of 12-13 minutes per MB WU. With the new x38g, the times changed only slightly for the better (only seconds). On my i7 with W7 64 bit and a GTX 460 OC, the times changed from about 12-15 minutes per MB WU (with x32f and 266.58) to 9-11 minutes with x38g and 275.33. I think that's a huge step forward, and the first time that the GTX 460 is faster than my GTX 260. I know that the W7 driver model is a burden, and I'm now happy that the new driver/app/GPU/OS combination can show its potential. And if the nV AP rev 521 uses more CPU time at the moment, I can live with that. IMHO the AP WUs are so rare that the additional CPU time doesn't add up to that much. And so far I haven't played with the various settings of the AP app. But one step after another, I have time :-) - Performance is not a simple linear function of the number of CPUs you throw at the problem. - |
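
The unroll advice in this thread (Raistmer: make the unroll factor a multiple of the GPU's compute-unit count; halfempty: halve the CU count reported in stdout) can be sketched as a small helper for listing values worth trying. This is only an illustration: the function names, the `max_unroll` cap, and the CU count of 8 are assumptions made for the example, not limits or values taken from the app. As Raistmer notes, memory-layout effects can make a "correct" multiple slower or even invalid, so each candidate still needs to be checked on real tasks.

```python
def unroll_candidates(compute_units, max_unroll=32):
    """List unroll factors that are whole multiples of the CU count.

    max_unroll is a hypothetical cap chosen for this sketch; it is not
    a limit documented for the AstroPulse app.
    """
    if compute_units < 1:
        raise ValueError("compute_units must be at least 1")
    return list(range(compute_units, max_unroll + 1, compute_units))


def half_cu_unroll(compute_units):
    """halfempty's rule of thumb: use half the reported CU count."""
    return max(1, compute_units // 2)


# Hypothetical card reporting 8 compute units.
print(unroll_candidates(8))
print(half_cu_unroll(8))
```

Each candidate would then be passed as `-unroll N` in the `<cmdline>` entry, as in Miep's post, and compared by run time and validity over several tasks.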
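
Two back-of-envelope calculations from the thread can be checked in a few lines: Claggy's credit comparison for the GTX460 (50 minutes per AP task worth 600 to 800 credits, versus roughly 10 minutes per MB task at 120 credits) and Raistmer's observation that CPU time nearly equals run time on the 275.xx driver. The function names below are made up for this sketch, and the credit figures are Claggy's estimates, not measured grants.

```python
def credits_per_hour(credits_per_task, minutes_per_task):
    """Convert a per-task credit figure into credits per hour."""
    return credits_per_task * 60.0 / minutes_per_task


def cpu_fraction(cpu_seconds, elapsed_seconds):
    """Share of one logical CPU a task kept busy while it ran."""
    return cpu_seconds / elapsed_seconds


# Claggy's GTX460 estimates.
mb_rate = credits_per_hour(120, 10)       # Multibeam
ap_rate_low = credits_per_hour(600, 50)   # AstroPulse, low estimate
ap_rate_high = credits_per_hour(800, 50)  # AstroPulse, high estimate

# Run time and CPU time quoted by Raistmer for JohnDK's first task.
busy = cpu_fraction(3372.34, 3376.00)

print(f"MB: {mb_rate:.0f} credits/h, AP: {ap_rate_low:.0f} to {ap_rate_high:.0f} credits/h")
print(f"CPU busy fraction: {busy:.3f}")
```

On these numbers AP comes out at or somewhat above MB on that card, and the task kept one logical CPU almost fully busy, which matches Raistmer's reading of the result page.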
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.