VLARs now also sent to CUDA app?

Message boards : Number crunching : VLARs now also sent to CUDA app?
Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 1256
Credit: 13,565,513
RAC: 13
Germany
Message 1765148 - Posted: 15 Feb 2016, 9:54:16 UTC

Hi there,

I recently got a VLAR WU on the CUDA 5.0 app:

http://setiathome.berkeley.edu/result.php?resultid=4719989683

Since it will be gone in a few hours, here is the stderr output:

Name	15oc15ab.6106.10701.8.35.245_1
Workunit	2059707558
Created	11 Feb 2016, 9:14:20 UTC
Sent	11 Feb 2016, 12:12:03 UTC
Report deadline	2 Apr 2016, 12:01:09 UTC
Received	15 Feb 2016, 0:41:20 UTC
Server state	Over
Outcome	Success
Client state	Done
Exit status	0 (0x0)
Computer ID	157931
Run time	5 hours 12 min 37 sec
CPU time	6 min 5 sec
Validate state	Valid
Credit	175.29
Device peak FLOPS	692.35 GFLOPS
Application version	SETI@home v8
Anonymous platform (NVIDIA GPU)
Peak working set size	91.43 MB
Peak swap size	127.82 MB
Peak disk usage	0.03 MB
Stderr output

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
v8 task detected
setiathome_CUDA: Found 2 CUDA device(s):
nVidia Driver Version 361.75
  Device 1: GeForce GT 640, 2048 MiB, regsPerBlock 65536
     computeCap 3.0, multiProcs 2 
     pciBusID = 1, pciSlotID = 0
  Device 2: GeForce GT 430, 512 MiB, regsPerBlock 32768
     computeCap 2.1, multiProcs 2 
     pciBusID = 5, pciSlotID = 0
     clockRate = 1400 MHz
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GT 640 is okay
SETI@home using CUDA accelerated device GeForce GT 640
mbcuda.cfg, matching pci device processpriority key detected
mbcuda.cfg, matching pci device pfblockspersm key detected
pulsefind: blocks per SM 2 
mbcuda.cfg, matching pci device pfperiodsperlaunch key detected
pulsefind: periods per launch 100 (default)
Priority of process set to BELOW_NORMAL (default) successfully
Priority of worker thread set successfully

setiathome enhanced x41zi (baseline v8), Cuda 5.00

setiathome_v8 task detected
Detected Autocorrelations as enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.080614

GPU current clockRate = 901 MHz

re-using dev_GaussFitResults array for dev_AutoCorrIn, 4194304 bytes
re-using dev_GaussFitResults+524288x8 array for dev_AutoCorrOut, 4194304 bytes
Thread call stack limit is: 1k
Exit Status: 0
boinc_exit(): requesting safe worker shutdown ->
  Worker Acknowledging exit request, spinning-> boinc_exit(): received safe worker shutdown acknowledge ->
Cuda threadsafe ExitProcess() initiated, rval 0
v8 task detected
setiathome_CUDA: Found 2 CUDA device(s):
nVidia Driver Version 361.75
  Device 1: GeForce GT 640, 2048 MiB, regsPerBlock 65536
     computeCap 3.0, multiProcs 2 
     pciBusID = 1, pciSlotID = 0
  Device 2: GeForce GT 430, 512 MiB, regsPerBlock 32768
     computeCap 2.1, multiProcs 2 
     pciBusID = 5, pciSlotID = 0
     clockRate = 1400 MHz
In cudaAcc_initializeDevice(): Boinc passed DevPref 2
setiathome_CUDA: CUDA Device 2 specified, checking...
   Device 2: GeForce GT 430 is okay
SETI@home using CUDA accelerated device GeForce GT 430
mbcuda.cfg, matching pci device processpriority key detected
mbcuda.cfg, matching pci device pfblockspersm key detected
pulsefind: blocks per SM 4 (Fermi or newer default)
mbcuda.cfg, matching pci device pfperiodsperlaunch key detected
pulsefind: periods per launch 200 
Priority of process set to BELOW_NORMAL (default) successfully
Priority of worker thread set successfully

setiathome enhanced x41zi (baseline v8), Cuda 5.00

setiathome_v8 task detected
Detected Autocorrelations as enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.080614
re-using dev_GaussFitResults array for dev_AutoCorrIn, 4194304 bytes
re-using dev_GaussFitResults+524288x8 array for dev_AutoCorrOut, 4194304 bytes
Thread call stack limit is: 1k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
cudaAcc_free() DONE.

Flopcounter: 63430621422201.633000

Spike count:    1
Autocorr count: 0
Pulse count:    7
Triplet count:  0
Gaussian count: 0
Worker preemptively acknowledging a normal exit.->
called boinc_finish
Exit Status: 0
boinc_exit(): requesting safe worker shutdown ->
boinc_exit(): received safe worker shutdown acknowledge ->
Cuda threadsafe ExitProcess() initiated, rval 0

</stderr_txt>
]]>


Have a look at the AR:
"WU true angle range is : 0.080614"

This WU ran nearly "forever" compared to other WUs, and I had a lot of lag while it was running. Is this a new adjustment to the server routine?
Aloha, Uli

ID: 1765148
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1765150 - Posted: 15 Feb 2016, 10:19:31 UTC - in response to Message 1765148.  

Have a look at the AR:
"WU true angle range is : 0.080614"

This WU ran nearly "forever" compared to other WUs, and I had a lot of lag while it was running. Is this a new adjustment to the server routine?

Yes, that matches the discussion and conclusion we reached a week ago. A small arithmetic error in preparing the splitters to handle Green Bank (and other) telescope data shifted the threshold for definition as a VLAR down by 50%, from 0.12 to 0.06 (see Panic Mode On (102) Server Problems?)
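To illustrate the effect of that shift, here is a minimal sketch of the cutoff logic (hypothetical names and logic; the real splitter code differs):

```python
# Hypothetical sketch of the VLAR cutoff described above; the actual
# splitter implementation differs, names here are illustrative only.

OLD_VLAR_THRESHOLD = 0.12     # original Arecibo bound
BUGGED_VLAR_THRESHOLD = 0.06  # bound after the arithmetic error (50% lower)

def is_vlar(angle_range: float, threshold: float = OLD_VLAR_THRESHOLD) -> bool:
    """A work unit counts as VLAR when its true angle range falls below the bound."""
    return angle_range < threshold

# The WU above (AR 0.080614) is VLAR under the original bound, so it would
# normally be kept off NVidia GPUs; under the bugged bound it was not
# classified as VLAR, so it went out to the CUDA app.
print(is_vlar(0.080614, OLD_VLAR_THRESHOLD))     # True
print(is_vlar(0.080614, BUGGED_VLAR_THRESHOLD))  # False
```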

Eric did reply "I'll fix it during this week's outage", but evidently something intervened and it dropped off the ToDo list. I'll remind him tomorrow, closer to the likely timeframe for action. (Today is a Federal Holiday - Presidents' Day - in the USA, so no point in writing today.)
ID: 1765150
Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 1256
Credit: 13,565,513
RAC: 13
Germany
Message 1765159 - Posted: 15 Feb 2016, 11:57:32 UTC

Richard, thanks for the info!
I must have overlooked that in the panic thread.
Aloha, Uli

ID: 1765159
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1765280 - Posted: 15 Feb 2016, 22:31:32 UTC

Yes, they sure are slow: an AR of 0.06 in 14 minutes.

Name	14oc11af.23215.148572.15.42.51_0
Workunit	2059161656
Created	10 Feb 2016, 21:26:02 UTC
Sent	11 Feb 2016, 1:53:31 UTC
Report deadline	4 Apr 2016, 21:29:56 UTC
Received	11 Feb 2016, 4:59:36 UTC
Server state	Over
Outcome	Success
Client state	Done
Exit status	0 (0x0)
Computer ID	7475713
Run time	14 min 5 sec
CPU time	2 min
Validate state	Valid
Credit	105.25
Device peak FLOPS	7,698.43 GFLOPS
Application version	SETI@home v8
Anonymous platform (NVIDIA GPU)
Stderr output

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 4 CUDA device(s):
  Device 1: GeForce GTX 980, 4095 MiB, regsPerBlock 65536
     computeCap 5.2, multiProcs 16 
     pciBusID = 1, pciSlotID = 0
  Device 2: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 2, pciSlotID = 0
  Device 3: GeForce GTX 980, 4095 MiB, regsPerBlock 65536
     computeCap 5.2, multiProcs 16 
     pciBusID = 3, pciSlotID = 0
  Device 4: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 4, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 4
setiathome_CUDA: CUDA Device 4 specified, checking...
   Device 4: GeForce GTX 780 is okay
SETI@home using CUDA accelerated device GeForce GTX 780
Using pfb = 64 from command line args
Using pfp = 3 from command line args

setiathome v8 enhanced x41p_zm, Cuda 7.50 special
Compiled with NVCC 7.5, using 6.5 libraries. Modifications done by petri33.



Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.068821
Sigma 20
Thread call stack limit is: 1k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE.
13
Flopcounter: 54084954807375.851562

Spike count:    4
Autocorr count: 0
Pulse count:    14
Triplet count:  4
Gaussian count: 0
06:55:48 (1116): called boinc_finish(0)

</stderr_txt>
]]>


To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1765280
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1765297 - Posted: 15 Feb 2016, 23:56:07 UTC - in response to Message 1765280.  

The reason for blocking VLARs from NVidia GPUs was that they caused significant system issues (screen lag, system becomes very sluggish/unresponsive).

Other than taking a long time to crunch (sometimes longer than the estimated times) my systems aren't showing any signs of sluggishness or lack of responsiveness.

So personally I wouldn't have any issue with them as long as the credit given reflected the work done.
Unfortunately that doesn't appear to be the case.

1,455.32 secs 66.27 credits
1,486.28 secs 101.89 credits
1,645.92 secs 123.29 credits
1,716.61 secs 77.68 credits
2,006.53 secs 91.27 credits
2,066.25 secs 88.43 credits
2,080.84 secs 109.99 credits
2,204.44 secs 103.19 credits
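To make the spread concrete, the figures above work out to nearly a 2x range in credit per hour (a rough illustrative calculation, not any metric the project itself computes):

```python
# Credit per hour for the eight tasks listed above; "credit rate" here is
# just an illustrative metric for comparing awards, not an official one.
tasks = [  # (run time in seconds, credit awarded)
    (1455.32, 66.27), (1486.28, 101.89), (1645.92, 123.29), (1716.61, 77.68),
    (2006.53, 91.27), (2066.25, 88.43), (2080.84, 109.99), (2204.44, 103.19),
]
rates = [credit * 3600.0 / secs for secs, credit in tasks]
print(f"credit/hour ranges from {min(rates):.0f} to {max(rates):.0f}")
# → credit/hour ranges from 154 to 270
```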
Grant
Darwin NT
ID: 1765297
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1765300 - Posted: 16 Feb 2016, 0:10:35 UTC

That's most definitely the very next challenge, as soon as the three main platforms, and possibly a fourth (the Jetson TK1, which I hear by PM today someone is working on :), come into lockstep. From there we basically make Petri-style streaming + CPU use + some other special optimisations configurable to how a user wants to run (with special tools to help decide how best to do so; minimal nerdiness required).

Big long drawn out process for me, but I think worth it in the long run, over churning out builds with compromises.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1765300
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1765492 - Posted: 16 Feb 2016, 15:24:43 UTC

The pay is not always that bad, but compared to 40 points in 70 seconds for shorties there is still a gap.

This is a 0.07 in 11 minutes (227 credits).

Name	19mr11ad.15949.4984.13.40.98_1
Workunit	2065134343
Created	16 Feb 2016, 4:44:32 UTC
Sent	16 Feb 2016, 9:40:38 UTC
Report deadline	9 Apr 2016, 19:38:48 UTC
Received	16 Feb 2016, 14:45:14 UTC
Server state	Over
Outcome	Success
Client state	Done
Exit status	0 (0x0)
Computer ID	7475713
Run time	11 min 12 sec
CPU time	2 min 8 sec
Validate state	Valid
Credit	227.85
Device peak FLOPS	7,698.43 GFLOPS
Application version	SETI@home v8
Anonymous platform (NVIDIA GPU)
Stderr output

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 4 CUDA device(s):
  Device 1: GeForce GTX 980, 4095 MiB, regsPerBlock 65536
     computeCap 5.2, multiProcs 16 
     pciBusID = 1, pciSlotID = 0
  Device 2: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 2, pciSlotID = 0
  Device 3: GeForce GTX 980, 4095 MiB, regsPerBlock 65536
     computeCap 5.2, multiProcs 16 
     pciBusID = 3, pciSlotID = 0
  Device 4: GeForce GTX 780, 3071 MiB, regsPerBlock 65536
     computeCap 3.5, multiProcs 12 
     pciBusID = 4, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 980 is okay
SETI@home using CUDA accelerated device GeForce GTX 980
Using pfb = 64 from command line args
Using pfp = 3 from command line args

setiathome v8 enhanced x41p_zm, Cuda 7.50 special
Compiled with NVCC 7.5, using 6.5 libraries. Modifications done by petri33.



Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.070283
Sigma 19
Thread call stack limit is: 1k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE.
13
Flopcounter: 53215262177658.898438

Spike count:    4
Autocorr count: 1
Pulse count:    7
Triplet count:  2
Gaussian count: 0
16:44:55 (22723): called boinc_finish(0)

</stderr_txt>
]]>



To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1765492
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1765495 - Posted: 16 Feb 2016, 15:28:07 UTC

And the kitties continue to crunch whatever is sent to them with the best apps available to them.
Without complaint.
Meow.

Although they do send the optimizers their best wishes and Godspeed.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1765495
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1769011 - Posted: 2 Mar 2016, 16:28:58 UTC - in response to Message 1765150.  
Last modified: 2 Mar 2016, 16:32:51 UTC

Ulrich Metzner wrote:
Have a look at the AR:
"WU true angle range is : 0.080614"

This WU ran nearly "forever" compared to other WUs, and I had a lot of lag while it was running. Is this a new adjustment to the server routine?


Richard Haselgrove wrote:

Yes, that matches the discussion and conclusion we reached a week ago. A small arithmetic error in preparing the splitters to handle Green Bank (and other) telescope data shifted the threshold for definition as a VLAR down by 50%, from 0.12 to 0.06 (see Panic Mode On (102) Server Problems?)

Eric did reply "I'll fix it during this week's outage", but evidently something intervened and it dropped off the ToDo list. I'll remind him tomorrow, closer to the likely timeframe for action. (Today is a Federal Holiday - Presidents' Day - in the USA, so no point in writing today.)


FYI

My PC still gets (and got) a few ARs <0.12 for the AMD/ATI GPU app (examples):

0.08x
29ja10aa.19856.74043.4.31.195_0
Created 2 Mar 2016, 3:20:51 UTC
Sent 2 Mar 2016, 7:19:26 UTC

0.09x
29ja10aa.19856.75270.4.31.229_1
Created 2 Mar 2016, 3:26:04 UTC
Sent 2 Mar 2016, 7:24:48 UTC

29ja10aa.19856.75270.4.31.233_1
Created 2 Mar 2016, 3:26:04 UTC
Sent 2 Mar 2016, 7:24:48 UTC
ID: 1769011
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1769013 - Posted: 2 Mar 2016, 16:51:23 UTC - in response to Message 1769011.  

Others seem to have received some VLAR WUs (AR 0.087915) too.

But the 26 min 40 sec from a 750Ti (Mac, Darwin) is not too bad.


http://setiathome.berkeley.edu/workunit.php?wuid=2080603401
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1769013
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1769015 - Posted: 2 Mar 2016, 16:55:51 UTC - in response to Message 1769011.  

Eric restored the VLAR angle range bound to the original value of 0.12 (for Arecibo recordings) at 21:56 UTC last night (1 Mar 2016), round about the time the project was brought back up after maintenance.

https://setisvn.ssl.berkeley.edu/trac/changeset/3396

Obviously, tasks split before maintenance will still be working their way through the system, but those examples do look suspicious.

It's possible that the source code was updated, but new splitters aren't going to be deployed until after testing of the other change. Keep an eye on things, and let us know if you see any more.
ID: 1769015
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1769044 - Posted: 2 Mar 2016, 19:33:30 UTC - in response to Message 1769032.  
Last modified: 2 Mar 2016, 19:38:31 UTC

Agree still see a few on my cuda machine

Tut... Was this with Raistmer's SoG? Looking at the stderr, it looks like it was the OpenCL version.
ID: 1769044
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1769065 - Posted: 2 Mar 2016, 20:54:10 UTC - in response to Message 1769058.  

Yeah, I saw a few on my SoG machine as well.

AR 0.08
ID: 1769065
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1769220 - Posted: 3 Mar 2016, 12:18:15 UTC

And here, like WU 2081841575

created 3 Mar 2016, 2:00:06 UTC
WU true angle range is : 0.070042

I'll drop a line to Eric this afternoon.
ID: 1769220
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1769261 - Posted: 3 Mar 2016, 16:53:59 UTC - in response to Message 1769220.  

Eric says they deployed the new splitter to Beta first just in case, but they'll deploy it here 'soon'.
ID: 1769261
Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 1256
Credit: 13,565,513
RAC: 13
Germany
Message 1769363 - Posted: 4 Mar 2016, 0:11:47 UTC - in response to Message 1769261.  

Eric says they deployed the new splitter to Beta first just in case, but they'll deploy it here 'soon'.

LOL, sorry about that... %)
Aloha, Uli

ID: 1769363
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1769549 - Posted: 4 Mar 2016, 20:06:52 UTC - in response to Message 1769261.  

Eric says they deployed the new splitter to Beta first just in case, but they'll deploy it here 'soon'.

This new splitter you speak of, does it have strange properties?
I'm just looking over the tasks at Beta and pondering what could be causing all those overflows.
https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=77980&offset=40
ID: 1769549
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1769552 - Posted: 4 Mar 2016, 20:14:44 UTC - in response to Message 1769549.  

Dunno, you'd better ask Eric that.
ID: 1769552
