Message boards :
Number crunching :
VLARs now also sent to CUDA app?
Author | Message |
---|---|
Ulrich Metzner Send message Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13 |
Hi there, I recently got a VLAR WU on the CUDA 5.0 app: http://setiathome.berkeley.edu/result.php?resultid=4719989683 Since it will be gone in a few hours, here are the task details and stderr output:

Name: 15oc15ab.6106.10701.8.35.245_1
Workunit: 2059707558
Created: 11 Feb 2016, 9:14:20 UTC
Sent: 11 Feb 2016, 12:12:03 UTC
Report deadline: 2 Apr 2016, 12:01:09 UTC
Received: 15 Feb 2016, 0:41:20 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 157931
Run time: 5 hours 12 min 37 sec
CPU time: 6 min 5 sec
Validate state: Valid
Credit: 175.29
Device peak FLOPS: 692.35 GFLOPS
Application version: SETI@home v8 Anonymous platform (NVIDIA GPU)
Peak working set size: 91.43 MB
Peak swap size: 127.82 MB
Peak disk usage: 0.03 MB

Stderr output:
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
v8 task detected
setiathome_CUDA: Found 2 CUDA device(s):
nVidia Driver Version 361.75
Device 1: GeForce GT 640, 2048 MiB, regsPerBlock 65536 computeCap 3.0, multiProcs 2 pciBusID = 1, pciSlotID = 0
Device 2: GeForce GT 430, 512 MiB, regsPerBlock 32768 computeCap 2.1, multiProcs 2 pciBusID = 5, pciSlotID = 0 clockRate = 1400 MHz
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce GT 640 is okay
SETI@home using CUDA accelerated device GeForce GT 640
mbcuda.cfg, matching pci device processpriority key detected
mbcuda.cfg, matching pci device pfblockspersm key detected
pulsefind: blocks per SM 2
mbcuda.cfg, matching pci device pfperiodsperlaunch key detected
pulsefind: periods per launch 100 (default)
Priority of process set to BELOW_NORMAL (default) successfully
Priority of worker thread set successfully
setiathome enhanced x41zi (baseline v8), Cuda 5.00
setiathome_v8 task detected
Detected Autocorrelations as enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is : 0.080614
GPU current clockRate = 901 MHz
re-using dev_GaussFitResults array for dev_AutoCorrIn, 4194304 bytes
re-using dev_GaussFitResults+524288x8 array for dev_AutoCorrOut, 4194304 bytes
Thread call stack limit is: 1k
Exit Status: 0
boinc_exit(): requesting safe worker shutdown ->
Worker Acknowledging exit request, spinning->
boinc_exit(): received safe worker shutdown acknowledge ->
Cuda threadsafe ExitProcess() initiated, rval 0
v8 task detected
setiathome_CUDA: Found 2 CUDA device(s):
nVidia Driver Version 361.75
Device 1: GeForce GT 640, 2048 MiB, regsPerBlock 65536 computeCap 3.0, multiProcs 2 pciBusID = 1, pciSlotID = 0
Device 2: GeForce GT 430, 512 MiB, regsPerBlock 32768 computeCap 2.1, multiProcs 2 pciBusID = 5, pciSlotID = 0 clockRate = 1400 MHz
In cudaAcc_initializeDevice(): Boinc passed DevPref 2
setiathome_CUDA: CUDA Device 2 specified, checking...
Device 2: GeForce GT 430 is okay
SETI@home using CUDA accelerated device GeForce GT 430
mbcuda.cfg, matching pci device processpriority key detected
mbcuda.cfg, matching pci device pfblockspersm key detected
pulsefind: blocks per SM 4 (Fermi or newer default)
mbcuda.cfg, matching pci device pfperiodsperlaunch key detected
pulsefind: periods per launch 200
Priority of process set to BELOW_NORMAL (default) successfully
Priority of worker thread set successfully
setiathome enhanced x41zi (baseline v8), Cuda 5.00
setiathome_v8 task detected
Detected Autocorrelations as enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is : 0.080614
re-using dev_GaussFitResults array for dev_AutoCorrIn, 4194304 bytes
re-using dev_GaussFitResults+524288x8 array for dev_AutoCorrOut, 4194304 bytes
Thread call stack limit is: 1k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
cudaAcc_free() DONE.
Flopcounter: 63430621422201.633000
Spike count: 1
Autocorr count: 0
Pulse count: 7
Triplet count: 0
Gaussian count: 0
Worker preemptively acknowledging a normal exit.->
called boinc_finish
Exit Status: 0
boinc_exit(): requesting safe worker shutdown ->
boinc_exit(): received safe worker shutdown acknowledge ->
Cuda threadsafe ExitProcess() initiated, rval 0
</stderr_txt>
]]>

Have a look at the AR: "WU true angle range is : 0.080614". This WU ran nearly "forever" compared to other WUs, and I had a lot of lag while it was running. Is this a new adjustment to the server routine? Aloha, Uli |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
Have a look at the AR: Yes, that matches the discussion and conclusion we reached a week ago. A small arithmetic error in preparing the splitters to handle Green Bank (and other) telescope data shifted the threshold for definition as a VLAR down by 50% - from 0.12 down to 0.06 (see Panic Mode On (102) Server Problems?). Eric did reply "I'll fix it during this week's outage", but evidently something intervened and it dropped off the ToDo list. I'll remind him tomorrow, closer to the likely timeframe for action. (Today is a Federal Holiday - Presidents' Day - in the USA, so no point in writing today.) |
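A quick sketch of what that halved threshold means for Uli's task (hypothetical code, not the actual splitter source - only the bounds 0.12/0.06 and the AR 0.080614 come from this thread):

```python
# Hypothetical sketch of the VLAR classification described above;
# this is NOT the real splitter code, just an illustration.
CORRECT_THRESHOLD = 0.12  # original Arecibo VLAR bound
BUGGY_THRESHOLD = 0.06    # the arithmetic slip halved it

def is_vlar(true_angle_range, threshold):
    """A work unit counts as 'very low angle range' below the threshold."""
    return true_angle_range < threshold

ar = 0.080614  # AR of the task quoted above

# With the correct bound this WU is a VLAR, so it would be kept off NVidia GPUs.
print(is_vlar(ar, CORRECT_THRESHOLD))  # True
# With the buggy bound it is not flagged, which is why it reached the CUDA app.
print(is_vlar(ar, BUGGY_THRESHOLD))    # False
```

With the bound halved, every WU with an AR between 0.06 and 0.12 slips through the VLAR filter and becomes eligible for GPU assignment.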
Ulrich Metzner Send message Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13 |
Richard, thanks for the info! I must have overlooked that in the panic thread. Aloha, Uli |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Yes, they sure are slow: a 0.06 in 14 minutes.

Name: 14oc11af.23215.148572.15.42.51_0
Workunit: 2059161656
Created: 10 Feb 2016, 21:26:02 UTC
Sent: 11 Feb 2016, 1:53:31 UTC
Report deadline: 4 Apr 2016, 21:29:56 UTC
Received: 11 Feb 2016, 4:59:36 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 7475713
Run time: 14 min 5 sec
CPU time: 2 min
Validate state: Valid
Credit: 105.25
Device peak FLOPS: 7,698.43 GFLOPS
Application version: SETI@home v8 Anonymous platform (NVIDIA GPU)

Stderr output:
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 4 CUDA device(s):
Device 1: GeForce GTX 980, 4095 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 1, pciSlotID = 0
Device 2: GeForce GTX 780, 3071 MiB, regsPerBlock 65536 computeCap 3.5, multiProcs 12 pciBusID = 2, pciSlotID = 0
Device 3: GeForce GTX 980, 4095 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 3, pciSlotID = 0
Device 4: GeForce GTX 780, 3071 MiB, regsPerBlock 65536 computeCap 3.5, multiProcs 12 pciBusID = 4, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 4
setiathome_CUDA: CUDA Device 4 specified, checking...
Device 4: GeForce GTX 780 is okay
SETI@home using CUDA accelerated device GeForce GTX 780
Using pfb = 64 from command line args
Using pfp = 3 from command line args
setiathome v8 enhanced x41p_zm, Cuda 7.50 special
Compiled with NVCC 7.5, using 6.5 libraries.
Modifications done by petri33.
Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is : 0.068821
Sigma 20
Thread call stack limit is: 1k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE.
13
Flopcounter: 54084954807375.851562
Spike count: 4
Autocorr count: 0
Pulse count: 14
Triplet count: 4
Gaussian count: 0
06:55:48 (1116): called boinc_finish(0)
</stderr_txt>
]]>

To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
The reason for blocking VLARs from NVidia GPUs was that they caused significant system issues (screen lag; the system becomes very sluggish/unresponsive). Other than taking a long time to crunch (sometimes longer than the estimated times), my systems aren't showing any signs of sluggishness or lack of responsiveness. So personally I wouldn't have any issue with them, as long as the credit given reflected the work done. Unfortunately, that doesn't appear to be the case:

1,455.32 secs - 66.27 credits
1,486.28 secs - 101.89 credits
1,645.92 secs - 123.29 credits
1,716.61 secs - 77.68 credits
2,006.53 secs - 91.27 credits
2,066.25 secs - 88.43 credits
2,080.84 secs - 109.99 credits
2,204.44 secs - 103.19 credits

Grant Darwin NT |
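How uneven that payout is can be checked with quick arithmetic on the runtime/credit pairs listed above (nothing assumed beyond the figures quoted):

```python
# Credit-per-second rates for the (runtime, credit) pairs above.
# If credit tracked the work done, tasks of similar length on the
# same host should earn similar rates; instead the spread is wide.
tasks = [
    (1455.32, 66.27), (1486.28, 101.89), (1645.92, 123.29),
    (1716.61, 77.68), (2006.53, 91.27), (2066.25, 88.43),
    (2080.84, 109.99), (2204.44, 103.19),
]
rates = [credit / secs for secs, credit in tasks]
spread = max(rates) / min(rates)
print(f"credits/sec: min {min(rates):.4f}, max {max(rates):.4f}, spread {spread:.2f}x")
```

The best-paid task here earns roughly 1.75x the rate of the worst-paid one, despite all of them running on the same host.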
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
That's most definitely the very next challenge, as soon as the 3 main platforms (and possibly a fourth - Jetson TK1, which I hear by PM today someone is working on :) ) come into lockstep. From there we basically make Petri-style streaming + CPU use + some other special optimisations configurable to how a user wants to run (with special tools to help decide how best to do so, minimal nerdiness required). A big, long drawn-out process for me, but I think worth it in the long run, over churning out builds with compromises. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
The pay is not always that bad. But compared to 40 points in 70 seconds for shorties, there is still a gap. This is a 0.07 in 11 minutes (227 credits).

Name: 19mr11ad.15949.4984.13.40.98_1
Workunit: 2065134343
Created: 16 Feb 2016, 4:44:32 UTC
Sent: 16 Feb 2016, 9:40:38 UTC
Report deadline: 9 Apr 2016, 19:38:48 UTC
Received: 16 Feb 2016, 14:45:14 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 7475713
Run time: 11 min 12 sec
CPU time: 2 min 8 sec
Validate state: Valid
Credit: 227.85
Device peak FLOPS: 7,698.43 GFLOPS
Application version: SETI@home v8 Anonymous platform (NVIDIA GPU)

Stderr output:
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 4 CUDA device(s):
Device 1: GeForce GTX 980, 4095 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 1, pciSlotID = 0
Device 2: GeForce GTX 780, 3071 MiB, regsPerBlock 65536 computeCap 3.5, multiProcs 12 pciBusID = 2, pciSlotID = 0
Device 3: GeForce GTX 980, 4095 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 3, pciSlotID = 0
Device 4: GeForce GTX 780, 3071 MiB, regsPerBlock 65536 computeCap 3.5, multiProcs 12 pciBusID = 4, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce GTX 980 is okay
SETI@home using CUDA accelerated device GeForce GTX 980
Using pfb = 64 from command line args
Using pfp = 3 from command line args
setiathome v8 enhanced x41p_zm, Cuda 7.50 special
Compiled with NVCC 7.5, using 6.5 libraries.
Modifications done by petri33.
Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is : 0.070283
Sigma 19
Thread call stack limit is: 1k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE.
13
Flopcounter: 53215262177658.898438
Spike count: 4
Autocorr count: 1
Pulse count: 7
Triplet count: 2
Gaussian count: 0
16:44:55 (22723): called boinc_finish(0)
</stderr_txt>
]]>

To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
And the kitties continue to crunch whatever is sent to them with the best apps available to them. Without complaint. Meow. Although they do send the optimizers their best wishes and Godspeed. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
Ulrich Metzner wrote: Have a look at the AR: Richard Haselgrove wrote:
FYI: my PC still gets/got a few ARs <0.12 for the AMD/ATI GPU app. Examples:

0.08x 29ja10aa.19856.74043.4.31.195_0 - Created 2 Mar 2016, 3:20:51 UTC, Sent 2 Mar 2016, 7:19:26 UTC
0.09x 29ja10aa.19856.75270.4.31.229_1 - Created 2 Mar 2016, 3:26:04 UTC, Sent 2 Mar 2016, 7:24:48 UTC
      29ja10aa.19856.75270.4.31.233_1 - Created 2 Mar 2016, 3:26:04 UTC, Sent 2 Mar 2016, 7:24:48 UTC |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Others seem to have received some VLAR WUs (0.087915) too. But 26 min 40 sec on a 750Ti (Mac, Darwin) is not too bad. http://setiathome.berkeley.edu/workunit.php?wuid=2080603401 To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
Eric restored the VLAR angle range bound to the original value of 0.12 (for Arecibo recordings) at 21:56 UTC last night (1 Mar 2016) - round about the time the project was brought back up after maintenance. https://setisvn.ssl.berkeley.edu/trac/changeset/3396 Obviously, tasks split before maintenance will still be working their way through the system, but those examples do look suspicious. It's possible that the source code was updated, but new splitters aren't going to be deployed until after testing of the other change. Keep an eye on things, and let us know if you see any more. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Agreed, I still see a few on my CUDA machine. Tut... was this with Raistmer's SoG? Looking at the stderr, it looks like it was the OpenCL version. |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Yeah, I saw a few on my SoG machine as well. AR 0.08 |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
And here, like WU 2081841575 created 3 Mar 2016, 2:00:06 UTC WU true angle range is : 0.070042 I'll drop a line to Eric this afternoon. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
Eric says they deployed the new splitter to Beta first just in case, but they'll deploy it here 'soon'. |
Ulrich Metzner Send message Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13 |
Eric says they deployed the new splitter to Beta first just in case, but they'll deploy it here 'soon'. LOL, sorry about that... %) Aloha, Uli |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Eric says they deployed the new splitter to Beta first just in case, but they'll deploy it here 'soon'. This new splitter you speak of, does it have strange properties? I'm just looking over the tasks at Beta and pondering what could be causing all those overflows. https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=77980&offset=40 |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
Dunno, you'd better ask Eric that. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.