GBT ('guppi') .vlar tasks will be sent to GPUs, what do you think about this?

Author	Message
TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1786751 - Posted: 11 May 2016, 15:43:16 UTC - in response to Message 1786742. I was finally able to compile Petri's code in Yosemite but I'm back to getting a number of these "SIGBUS: bus error". It's a little faster with ToolKit 7.5, it's just strange why I'm getting what appears to be a BOINC Error. The Task is finished, the results printed, then it gives an Error; ... setiathome v8 enhanced x41p_zi r3452-64, Cuda 7.50 special Compiled with NVCC 7.5. Modifications done by petri33. Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.008975 Sigma 372 Sigma > GaussTOffsetStop: 372 > -308 Thread call stack limit is: 1k cudaAcc_free() called... cudaAcc_free() running... cudaAcc_free() PulseFind freed... cudaAcc_free() Gaussfit freed... cudaAcc_free() AutoCorrelation freed... 1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE. 13 Flopcounter: 31398940702413.851562 Spike count: 1 Autocorr count: 1 Pulse count: 5 Triplet count: 3 Gaussian count: 0 SIGBUS: bus error Crashed executable name: setiathome_x41p_zi_x86_64-apple-darwin_cuda75 Machine type Intel 80486 (64-bit executable) System version: Macintosh OS 10.10.5 build 14F1713 Wed May 11 10:59:49 2016 ... I've switched back to BOINC 7.4.36, but, I think I've been here and done this before. I've been using <no_priority_change> for years. ID: 1786751 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1786753 - Posted: 11 May 2016, 15:48:47 UTC - in response to Message 1786751. Last modified: 11 May 2016, 15:49:07 UTC Assuming you changed maxregisters up to 64 as you stated elsewhere, try dialling it back to 32. The long pulsefinds in both baseline and Petri's code are greedy, and will dominate in guppi vlars, potentially tripping up OS failsafes. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1786753 ·

Gianfranco Lizzio Volunteer tester Send message Joined: 5 May 99 Posts: 39 Credit: 28,049,113 RAC: 87	Message 1786756 - Posted: 11 May 2016, 15:57:05 UTC - in response to Message 1786753. Assuming you changed maxregisters up to 64 as you stated elsewhere, try dialling it back to 32. I'm using maxrregcount=64 in El Capitan without any problem. Gianfranco I don't want to believe, I want to know! ID: 1786756 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1786757 - Posted: 11 May 2016, 15:58:53 UTC - in response to Message 1786753. I think part of the compiling problem I was having before was trying it with maxregisters=32. I believe it's coded to use maxregisters=64. I don't have this problem using maxregisters=64 with the ToolKit 6.5 App compiled in ML. Also, it doesn't have a problem with the Science App, the App is Finished, and the results printed before the SIGBUS error. ID: 1786757 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1786758 - Posted: 11 May 2016, 15:59:33 UTC - in response to Message 1786756. Assuming you changed maxregisters up to 64 as you stated elsewhere, try dialling it back to 32. I'm using maxrregcount=64 in El Capitan without any problem. Gianfranco Hmmmm, Tbar's Build/Boinc suspicions remain then :), I have no other ideas. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1786758 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1786759 - Posted: 11 May 2016, 16:01:57 UTC - in response to Message 1786757. I think part of the compiling problem I was having before was trying it with maxregisters=32. I believe it's coded to use maxregisters=64. I don't have this problem using maxregisters=64 with the ToolKit 6.5 App compiled in ML. Also, it doesn't have a problem with the Science App, the App is Finished, and the results printed before the SIGBUS error. IOW it's dying after boinc_finish() call ? or being killed by the client (if any way to tell the difference) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1786759 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1786761 - Posted: 11 May 2016, 16:13:43 UTC - in response to Message 1786759. I think part of the compiling problem I was having before was trying it with maxregisters=32. I believe it's coded to use maxregisters=64. I don't have this problem using maxregisters=64 with the ToolKit 6.5 App compiled in ML. Also, it doesn't have a problem with the Science App, the App is Finished, and the results printed before the SIGBUS error. IOW it's dying after boinc_finish() call ? or being killed by the client (if any way to tell the difference) All I know is the last 30 seconds seems to take much longer than 30 secs ;-) So far no errors since going back to 7.4.36. ID: 1786761 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1786763 - Posted: 11 May 2016, 16:18:25 UTC - in response to Message 1786761. Last modified: 11 May 2016, 16:22:44 UTC One thing that comes to mind is a change at some point in the shared memory arrangement. This may not affect Gianfranco or myself, but may (possibly) affect you. Using app_info somewhere, iirc, you may need to include an explicit <api_version> tag containing the api version number you built with. That's one item that could depend on both the api and client version you chose (though I didn't have to mess with adding the entry on my system). "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1786763 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1786765 - Posted: 11 May 2016, 16:32:41 UTC - in response to Message 1786763. One thing that comes to mind is a change at some point in the shared memory arrangement. This may not affect Gianfranco or myself, but may (possibly) affect you. Using app_info somewhere, iirc, you may need to include an explicit <api_version> tag containing the api version number you built with. That's one item that could depend on both the api and client version you chose (though I didn't have to mess with adding the entry on my system). In reality, it doesn't have to be the exact build version: the only tests are for BOINC/API 6.0 (PID instead of heartbeat), and 7.5 (something to do with a Bitcoin Utopia command line). There was also the shared memory segment issue (the old SpyHill bug) which was specific to multi-core, multi-instance Macs, but that would blow up at the start of the run, not at the end. It would be good practice to get into the habit of using <api_version> tags in app_info.xml, just in case somebody starts using one for real and we miss it, but I doubt they're implicated here. ID: 1786765 ·
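As a rough illustration of the "get into the habit" advice above, the sketch below scans an app_info.xml-style document and lists the app versions that carry no <api_version> tag. The sample XML, the app names and the helper name versions_missing_api_tag are hypothetical, and the layout (an <api_version> element nested inside each <app_version>) is an assumption about the file format, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed app_info.xml content; real files carry
# more elements (file_info, file_ref, plan_class, ...).
SAMPLE_APP_INFO = """<app_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>800</version_num>
    <api_version>7.3.0</api_version>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <version_num>700</version_num>
  </app_version>
</app_info>"""


def versions_missing_api_tag(xml_text):
    """Return the app_name of every <app_version> without an <api_version> child."""
    root = ET.fromstring(xml_text)
    missing = []
    for app_version in root.iter("app_version"):
        if app_version.find("api_version") is None:
            missing.append(app_version.findtext("app_name", default="(unnamed)"))
    return missing


if __name__ == "__main__":
    for name in versions_missing_api_tag(SAMPLE_APP_INFO):
        print("no <api_version> tag for:", name)
```

Which version number to put in the tag is a separate question; the posts above suggest the value only matters relative to a couple of thresholds.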

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1786777 - Posted: 11 May 2016, 17:09:01 UTC - in response to Message 1786763. One thing that comes to mind is a change at some point in the shared memory arrangement. This may not affect Gianfranco or myself, but may (possibly) affect you. Using app_info somewhere, iirc, you may need to include an explicit <api_version> tag containing the api version number you built with. That's one item that could depend on both the api and client version you chose (though I didn't have to mess with adding the entry on my system). Hmmm, that doesn't seem to work very well for me. Adding the line <api_version>7.7.0</api_version> results in Many trashed tasks; Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.006956 Sigma 513 Sigma > GaussTOffsetStop: 513 > -449 plan autocorr R2C batched FFT failed 5 Not enough VRAM for Autocorrelations... setiathome_CUDA: CUDA runtime ERROR in device memory allocation, attempt 1 of 6 cudaAcc_free() called... cudaAcc_free() running... cudaAcc_free() PulseFind freed... cudaAcc_free() Gaussfit freed... cudaAcc_free() AutoCorrelation freed... 1,2,3,4,5,6,7,8,9,10,10,11,12,cudaAcc_free() DONE. 13 waiting 5 seconds... Reinitialising Cuda Device... Cuda error 'Couldn't get cuda device count ' in file 'cuda/cudaAcceleration.cu' in line 161 : invalid resource handle. </stderr_txt> ... Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.005559 Cuda error 'cudaMalloc((void**) &dev_WorkData' in file 'cuda/cudaAcceleration.cu' in line 439 : out of memory. ID: 1786777 ·

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1786780 - Posted: 11 May 2016, 17:26:12 UTC - in response to Message 1786777. Last modified: 11 May 2016, 17:27:04 UTC plan autocorr R2C batched FFT failed 5 Not enough VRAM for Autocorrelations... setiathome_CUDA: CUDA runtime ERROR in device memory allocation, attempt 1 of 6 It has failed at the start... Sometimes a reboooooot helps. The GPU memory can get into a fragmented state. There is plenty of RAM available, but not in one contiguous block. How about running the older build? On top hosts there is one running happily and without any extra inconclusives. (I have trashed my version and need to revert back because of too many inconclusives, no errors though.) To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1786780 ·
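The fragmentation point above can be shown with a toy model: a single large allocation can fail even when the total free memory exceeds the request, because no single free block is big enough. This is only a sketch with made-up block sizes; it does not describe how the CUDA driver actually manages device memory, and the function names are hypothetical.

```python
def free_memory_summary(free_blocks):
    """Return (total_free, largest_contiguous) for a list of free block sizes in MiB."""
    total_free = sum(free_blocks)
    largest_contiguous = max(free_blocks, default=0)
    return total_free, largest_contiguous


def can_allocate(free_blocks, request):
    """A single allocation needs one free block at least as large as the request."""
    return any(block >= request for block in free_blocks)


if __name__ == "__main__":
    # Hypothetical fragmented state: 650 MiB free in total, largest hole 200 MiB.
    fragmented = [200, 150, 180, 120]
    total, largest = free_memory_summary(fragmented)
    print(f"free: {total} MiB, largest contiguous block: {largest} MiB")
    print("256 MiB request fits:", can_allocate(fragmented, 256))
```

In this model a restart helps only because it returns the memory to one large free block.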

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1786783 - Posted: 11 May 2016, 17:31:12 UTC - in response to Message 1786777. Hmmm, that doesn't seem to work very well for me. Adding the line <api_version>7.7.0</api_version> results in Many trashed tasks; That might mess with the command line passed to your application at startup:
    if (!app_version->api_version_at_least(7, 5)) {
        int rt = app_version->gpu_usage.rsc_type;
        if (rt) {
            coproc_cmdline(rt, result, app_version->gpu_usage.usage, cmdline, sizeof(cmdline));
        }
    }
Try knocking it back to 7.3.0 and see if that changes anything. But you should be getting info about the device to use from init_data.xml these days, not from the command line at all. ID: 1786783 ·
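The condition in the quoted client code can be modelled roughly as below, to show why 7.7.0 and 7.3.0 land on different sides of the gate. This is a Python sketch, not BOINC code: it assumes api_version_at_least compares major and minor numbers only, and parse_api_version and client_builds_coproc_cmdline are hypothetical helper names.

```python
def parse_api_version(text):
    """Split a version string such as '7.3.0' into a (major, minor, release) tuple."""
    parts = [int(piece) for piece in text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # pad short strings such as '7.5'
    return tuple(parts[:3])


def api_version_at_least(api_version, major, minor):
    """Assumed semantics: true when (major, minor) of the tag is >= the threshold."""
    tag_major, tag_minor, _ = parse_api_version(api_version)
    return (tag_major, tag_minor) >= (major, minor)


def client_builds_coproc_cmdline(api_version, uses_gpu=True):
    """Mirror the quoted condition: only pre-7.5 GPU apps get the coproc command line."""
    return uses_gpu and not api_version_at_least(api_version, 7, 5)


if __name__ == "__main__":
    for tag in ("7.7.0", "7.3.0"):
        print(tag, "-> coproc command line built:", client_builds_coproc_cmdline(tag))
```

Under these assumptions a 7.7.0 tag skips the command-line device arguments while 7.3.0 keeps them, which would matter to an application that still reads the device from its command line rather than from init_data.xml.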

Brent Norman Volunteer tester Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835	Message 1786790 - Posted: 11 May 2016, 17:52:51 UTC @Mark 2 hours seems a bit long for those cards. I ran both my computers running 3 tasks on 750Ti's (started at same time) and ended up with run times of 1:29 to 1:39. A bit strange since I was running 1:05 with 2 tasks. My mbcuda.cfg:
    processpriority = abovenormal
    pfblockspersm = 16
    pfperiodsperlaunch = 400
ID: 1786790 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1786792 - Posted: 11 May 2016, 17:55:49 UTC - in response to Message 1786790. @Mark 2 hours seems a bit long for those cards. I ran both my computers running 3 tasks on 750Ti's (started at same time) and ended up with run times of 1:29 to 1:39. A bit strange since I was running 1:05 with 2 tasks. My mbcuda.cfg processpriority = abovenormal pfblockspersm = 16 pfperiodsperlaunch = 400 Well, with work keeping me busy, I have not done any optimizing yet. This weekend, I'll have to see if I can find time to make sure I am running the right apps and add some opti parameters. Hopefully that will get the kitties back on the right track. I'll be posting in the team forum for some help and tips. Thanks for the input. Meow. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1786792 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1786793 - Posted: 11 May 2016, 17:55:53 UTC - in response to Message 1786780. Last modified: 11 May 2016, 18:16:50 UTC Well, rebooting didn't help. I had to remove the line <api_version>7.7.0</api_version> to get it to stop immediately trashing All the GPU tasks. Changing the line to <api_version>7.3.0</api_version> allows it to start again. The current build is using the code in Jason's folder, it seems to be quite a bit faster than the exact same code compiled in ML with ToolKit 6.5. It would be nice if it weren't for the Crash After Finish. I might try building it again with boinc-master 7.5, the App compiled in ML with ToolKit 6.5 was using boinc-master 7.5. Adding <api_version>7.3.0</api_version> didn't help. Still get the Crash After Finish with 1 out of 3 tasks. Why does the validator give Some tasks an Invalid when run a second time? It was Only Reported Once. As you can see, the results are the same, http://setiathome.berkeley.edu/workunit.php?wuid=2155978136 The way I see it the Results you Report are the ones that should be used. Not the ones caught up in a Mass trashing and later run and reported. ID: 1786793 ·

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1786794 - Posted: 11 May 2016, 18:19:45 UTC - in response to Message 1786789. Running Raistmer's SoG on my Titan X's Running 3 at a time on each GPU with commandlines from Mike, Each are taking just under 1 hour (56-58 minutes) http://setiathome.berkeley.edu/result.php?resultid=4923517925 http://setiathome.berkeley.edu/result.php?resultid=4923517967 http://setiathome.berkeley.edu/result.php?resultid=4923517729 That was a long time. Running 3 at a time with SoG on my 980, takes 28-29 minutes each. http://setiathome.berkeley.edu/result.php?resultid=4924826717 http://setiathome.berkeley.edu/result.php?resultid=4924819417 http://setiathome.berkeley.edu/result.php?resultid=4924799847 Those have ar 0.009, 0.01 and 0.009. Other reporters may have ar at around 0.005 or something. Nice times though (3 ones under 30). To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1786794 ·

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1786795 - Posted: 11 May 2016, 18:23:26 UTC Just as someone said: let the creditscrew settle now for a week or three. I was looking at the credit granted for host http://setiathome.berkeley.edu/results.php?hostid=7939003&offset=0&show_names=0&state=4&appid= and saw a huge difference in credit granted depending on which kind of host the task validated against. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1786795 ·

Zalster Volunteer tester Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1786797 - Posted: 11 May 2016, 18:37:01 UTC - in response to Message 1786789. Running Raistmer's SoG on my Titan X's Running 3 at a time on each GPU with commandlines from Mike, Each are taking just under 1 hour (56-58 minutes) http://setiathome.berkeley.edu/result.php?resultid=4923517925 http://setiathome.berkeley.edu/result.php?resultid=4923517967 http://setiathome.berkeley.edu/result.php?resultid=4923517729 That was a long time. Running 3 at a time with SoG on my 980, takes 28-29 minutes each. http://setiathome.berkeley.edu/result.php?resultid=4924826717 http://setiathome.berkeley.edu/result.php?resultid=4924819417 http://setiathome.berkeley.edu/result.php?resultid=4924799847 Are you dividing by 3 or is that the exact time? 3 in 54 minutes for me averages 18 minutes. ID: 1786797 ·

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1786799 - Posted: 11 May 2016, 18:44:55 UTC - in response to Message 1786797. Last modified: 11 May 2016, 18:50:32 UTC Running Raistmer's SoG on my Titan X's Running 3 at a time on each GPU with commandlines from Mike, Each are taking just under 1 hour (56-58 minutes) http://setiathome.berkeley.edu/result.php?resultid=4923517925 http://setiathome.berkeley.edu/result.php?resultid=4923517967 http://setiathome.berkeley.edu/result.php?resultid=4923517729 That was a long time. Running 3 at a time with SoG on my 980, takes 28-29 minutes each. EDIT: A retake: all were run on GPU 0. So 29 min for each. Not so good. http://setiathome.berkeley.edu/result.php?resultid=4924826717 http://setiathome.berkeley.edu/result.php?resultid=4924819417 http://setiathome.berkeley.edu/result.php?resultid=4924799847 Are you dividing by 3 or is that the exact time? 3 in 54 minutes for me averages 18 minutes I clicked the links and saw about 29 min for each of the three. So running one at a time that would be about 10 min for one. EDIT: but they'd have to run on the same GPU simultaneously. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1786799 ·
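The arithmetic in the two posts above is the same calculation: when several tasks share one GPU for their whole run, the effective time per task is the wall-clock time divided by the number of concurrent tasks. A minimal sketch, assuming the tasks really do overlap completely on the same GPU (the function name is hypothetical):

```python
def effective_minutes_per_task(wall_minutes, concurrent_tasks):
    """Effective per-task time when `concurrent_tasks` run together for `wall_minutes`."""
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return wall_minutes / concurrent_tasks


if __name__ == "__main__":
    # Figures quoted in the thread: 3 tasks in 54 minutes, and 3 tasks at about 29 minutes.
    print(effective_minutes_per_task(54, 3))            # 18.0
    print(round(effective_minutes_per_task(29, 3), 1))  # 9.7
```

If the tasks overlap only partly, or share the GPU with other work, the division overstates the throughput.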

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1786802 - Posted: 11 May 2016, 18:53:54 UTC - in response to Message 1786801. Last modified: 11 May 2016, 18:54:26 UTC Running Raistmer's SoG on my Titan X's Running 3 at a time on each GPU with commandlines from Mike, Each are taking just under 1 hour (56-58 minutes) http://setiathome.berkeley.edu/result.php?resultid=4923517925 http://setiathome.berkeley.edu/result.php?resultid=4923517967 http://setiathome.berkeley.edu/result.php?resultid=4923517729 That was a long time. Running 3 at a time with SoG on my 980, takes 28-29 minutes each. http://setiathome.berkeley.edu/result.php?resultid=4924826717 http://setiathome.berkeley.edu/result.php?resultid=4924819417 http://setiathome.berkeley.edu/result.php?resultid=4924799847 Are you dividing by 3 or is that the exact time? 3 in 54 minutes for me averages 18 minutes Not dividing as you can see from the results. It is the exact times. Each of them is done in less than 30 minutes.
Name blc2_2bit_guppi_57451_24929_HIP63406_0019.20556.831.17.26.163.vlar_0 Run time 29 min 10 sec CPU time 27 min 50 sec
Name blc2_2bit_guppi_57451_25284_HIP63406_OFF_0020.20361.831.17.26.177.vlar_1 Run time 28 min 25 sec CPU time 27 min 48 sec
Name blc2_2bit_guppi_57451_24929_HIP63406_0019.17306.831.18.27.176.vlar_1 Run time 28 min 32 sec CPU time 25 min 40 sec
"Ja, men tider dom var uppladdat är inte samma. Dom var något interlaced." Yes, but the times they were uploaded are not the same. They were somewhat interlaced. (Not running together the whole time, and other kinds of tasks may have been running at the same time.) To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1786802 ·
