Panic Mode On (84) Server Problems?

Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1373211 - Posted: 30 May 2013, 1:08:18 UTC

In honor of the SETI@home v7 rollout, it is time for a new thread.

ID: 1373211
Darth Beaver (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1373213 - Posted: 30 May 2013, 1:11:41 UTC - in response to Message 1373211.  

In honor of the SETI@home v7 rollout, it is time for a new thread.


+1.....hehehehe
ID: 1373213
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1373219 - Posted: 30 May 2013, 1:38:09 UTC
Last modified: 30 May 2013, 1:39:47 UTC

V7 is running, but it seems something is not working right: on a 2x690 host it runs cuda42 and some cuda32, not the right one for this type of GPU (I expected it to run cuda50, or am I wrong?).

http://setiathome.berkeley.edu/show_host_detail.php?hostid=6269362

Any clue?
ID: 1373219
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1373222 - Posted: 30 May 2013, 1:41:44 UTC - in response to Message 1373219.  
Last modified: 30 May 2013, 1:47:30 UTC

Any clue?


The server needs to try each compatible version (gather statistics), to determine which is best. This should converge on Cuda5 for those after many tasks. If it converges on the wrong one, there will be a reset statistics button (at some stage).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1373222
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1373224 - Posted: 30 May 2013, 1:48:53 UTC - in response to Message 1373222.  
Last modified: 30 May 2013, 1:49:26 UTC

Any clue?


The server needs to try each compatible version, to determine which is best. This should converge on Cuda5 for those after many tasks. If it converges on the wrong one, there will be a reset statistics button (at some stage).

So a select version option in the app_config.xml could be a good idea...

That will be a long night/day for you guys... hope you all have a good beer & coffee stock to help.
ID: 1373224
Profile SciManStev (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Jun 99
Posts: 6658
Credit: 121,090,076
RAC: 0
United States
Message 1373226 - Posted: 30 May 2013, 1:51:16 UTC

I am running stock now, and doing exactly what Jason suggests. It will all balance out in the long run, and with a project that has the potential to go on long past my expected lifetime, I am happy. It will all balance out on its own. Then the tweaking begins......

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1373226
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1373228 - Posted: 30 May 2013, 1:55:40 UTC - in response to Message 1373224.  

Any clue?


The server needs to try each compatible version, to determine which is best. This should converge on Cuda5 for those after many tasks. If it converges on the wrong one, there will be a reset statistics button (at some stage).

So a select version option in the app_config.xml could be a good idea...

That will be a long night/day for you guys... hope you all have a good beer & coffee stock to help.


Forcing an application version can already be done with app_info.xml, as the installers do. From the server's perspective it would need your knowledge about versions, but it starts as a blank slate. For credits to dial in, and for your APRs to correctly confirm what you already know (or break horribly), it's best to let it run stock for a while and see if it works out sensible numbers, or David & Eric need to be locked in a small dark room together until they work it out :D
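
For anyone who does want to pin a version, a bare-bones sketch of the relevant app_info.xml entries is below. The executable name and version number are only placeholders (an installer such as the Lunatics one writes the real thing), and a complete file also needs <file_info> entries for the CUDA DLLs that ship with the app:

<app_info>
  <app>
    <name>setiathome_v7</name>
  </app>
  <file_info>
    <name>Lunatics_x41zc_win32_cuda50.exe</name>  <!-- placeholder executable name -->
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v7</app_name>
    <version_num>700</version_num>
    <plan_class>cuda50</plan_class>  <!-- pin the cuda50 build -->
    <avg_ncpus>0.04</avg_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>Lunatics_x41zc_win32_cuda50.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>

With that in the project directory, BOINC treats the host as an anonymous platform and only asks for work for the versions you list, so the server's trial-and-error selection never comes into play.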

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1373228
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1373229 - Posted: 30 May 2013, 1:56:09 UTC - in response to Message 1373222.  



The server needs to try each compatible version (gather statistics), to determine which is best.


We could save it a lot of trouble if that were an option.
ID: 1373229
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11412
Credit: 29,581,041
RAC: 66
United States
Message 1373230 - Posted: 30 May 2013, 1:59:49 UTC - in response to Message 1373226.  

I am running stock now, and doing exactly what Jason suggests. It will all balance out in the long run, and with a project that has the potential to go on long past my expected lifetime, I am happy. It will all balance out on its own. Then the tweaking begins......

Steve

Steve, that is too damn rational.
ID: 1373230
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1373231 - Posted: 30 May 2013, 2:01:02 UTC - in response to Message 1373229.  
Last modified: 30 May 2013, 2:01:34 UTC



The server needs to try each compatible version (gather statistics), to determine which is best.


We could save it a lot of trouble if that were an option.


As per previous post, yeah you already have that option with app_info.xml.
In the short term, it's more about dialling in credits, which will probably be all over the place for some time.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1373231
Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1373239 - Posted: 30 May 2013, 2:25:22 UTC

In the previous thread, there were some comments about VLARs going to Nvidia GPUs now. I'm still on v6 for another 10 hours while draining my cache, and I notice that I have 3 VLARs and a non-VLAR now running on my 670 with x41zc, Cuda 5.00. I don't see any adverse effects except that run times will be longer than normal and there is a little lag in the system (responsible for any typos in this post!) :)

But the odd thing is that my gpu temperature is more than 10 degrees below normal. No downclock and gpu utilization is at a constant 99%. CPU usage is below normal for 6.10 tasks. Why would these run cooler? I expected the opposite.


Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1373239
ExchangeMan
Volunteer tester

Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1373242 - Posted: 30 May 2013, 2:36:35 UTC - in response to Message 1373239.  

In the previous thread, there were some comments about VLARs going to Nvidia GPUs now. I'm still on v6 for another 10 hours while draining my cache, and I notice that I have 3 VLARs and a non-VLAR now running on my 670 with x41zc, Cuda 5.00. I don't see any adverse effects except that run times will be longer than normal and there is a little lag in the system (responsible for any typos in this post!) :)

But the odd thing is that my gpu temperature is more than 10 degrees below normal. No downclock and gpu utilization is at a constant 99%. CPU usage is below normal for 6.10 tasks. Why would these run cooler? I expected the opposite.


I see the exact same thing. My guess is that since VLARs don't parallelize well, you can't keep as many cores busy in the GPU. However, Precision X reports high CPU usage, especially with another task running on that same GPU. Fewer cores in use means less heat.

I can see this is going to be a problem, since not only does a VLAR run much longer than a normal work unit, but it degrades the other jobs running on that same GPU. I don't know if there is a workaround for this except for running only a single task at a time on all my GPUs. That doesn't seem to be a very efficient use of GPU resources. I would much rather the VLARs stay on the CPUs, which do pretty well with them. I wouldn't care if all my CPU tasks were VLARs.

ID: 1373242
Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1373246 - Posted: 30 May 2013, 2:43:59 UTC - in response to Message 1373242.  

In the previous thread, there were some comments about VLARs going to Nvidia GPUs now. I'm still on v6 for another 10 hours while draining my cache, and I notice that I have 3 VLARs and a non-VLAR now running on my 670 with x41zc, Cuda 5.00. I don't see any adverse effects except that run times will be longer than normal and there is a little lag in the system (responsible for any typos in this post!) :)

But the odd thing is that my gpu temperature is more than 10 degrees below normal. No downclock and gpu utilization is at a constant 99%. CPU usage is below normal for 6.10 tasks. Why would these run cooler? I expected the opposite.


I see the exact same thing. My guess is that since VLARs don't parallelize well, you can't keep as many cores busy in the GPU. However, Precision X reports high CPU usage, especially with another task running on that same GPU. Fewer cores in use means less heat.

I can see this is going to be a problem, since not only does a VLAR run much longer than a normal work unit, but it degrades the other jobs running on that same GPU. I don't know if there is a workaround for this except for running only a single task at a time on all my GPUs. That doesn't seem to be a very efficient use of GPU resources. I would much rather the VLARs stay on the CPUs, which do pretty well with them. I wouldn't care if all my CPU tasks were VLARs.


I have something weird going on with my 2 machines: they were assigned VLARs, but the tasks are showing up as suspended by user, and I did not suspend them.

ID: 1373246
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 1373252 - Posted: 30 May 2013, 2:55:14 UTC

Is there a need to change preferences for amount of work and additional work with V7? I seem to remember they are now backwards, or have I read too many posts?
Dave

ID: 1373252
Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1373257 - Posted: 30 May 2013, 3:10:51 UTC

Is there a need to change preferences for amount of work and additional work with V7? I seem to remember they are now backwards, or have I read too many posts?
Dave

Think you're remembering the difference in work fetch settings between BOINC 6 (and earlier) and BOINC 7, where you have to switch those settings. Unless you upgrade BOINC, there's no need to change them. You do need to make sure SETI@home v7 is selected in your website project preferences.
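
For reference, under BOINC 7 those two settings correspond (roughly - the numbers below are just placeholders) to these fields in global_prefs_override.xml; the first is now the minimum work buffer and the second is extra on top of it, whereas BOINC 6 treated the first as "connect about every X days":

<global_preferences>
  <work_buf_min_days>0.5</work_buf_min_days>  <!-- minimum work buffer, in days -->
  <work_buf_additional_days>0.25</work_buf_additional_days>  <!-- extra buffer on top of the minimum -->
</global_preferences>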

Update on my earlier VLAR comment. Lag time got out of hand - had trouble making that post. Dropped down to 3 at a time (2 VLARs) and it is still bad. Took 10 seconds to open this thread. VLARs on Nvidia aren't going to work for me. Also has an adverse impact on other GPU tasks - their run times are abnormally long.
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1373257
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1373259 - Posted: 30 May 2013, 3:16:32 UTC

I agree, the Nvidias don't like the VLARs...

Is there any configuration we could use to keep the GPUs from receiving VLARs?
ID: 1373259
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 66286
Credit: 55,293,173
RAC: 49
United States
Message 1373260 - Posted: 30 May 2013, 3:18:23 UTC

Me, I'm working through my CPU cache before I can switch over; so far that's 35 hours of CPU work, or 96 WUs, plus the 3 I'm working on now.
Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST

ID: 1373260
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1373264 - Posted: 30 May 2013, 3:43:34 UTC - in response to Message 1373257.  
Last modified: 30 May 2013, 3:46:25 UTC

...Update on my earlier VLAR comment. Lag time got out of hand - had trouble making that post. Dropped down to 3 at a time (2 VLARs) and it is still bad. Took 10 seconds to open this thread. VLARs on Nvidia aren't going to work for me. Also has an adverse impact on other GPU tasks - their run times are abnormally long.

Don't send those things to AMDs either. I tried a few on my 6850 with MB7_win_x86_SSE_OpenCL_ATi_HD5_r1817.exe. They work, but the computer has 'spikes' of unresponsiveness. You can actually see it in the SIV CPU meter as a clear line every 30 seconds or so. That is with the period of iterations set at 32. Not to mention they took ~40 minutes to complete; the 6850 does an unblanked AP in less time. The GPU temp was lower, and so were the credits...
ID: 1373264
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1373268 - Posted: 30 May 2013, 3:56:13 UTC - in response to Message 1373259.  

I agree, the Nvidias don't like the VLARs...


My Nvidias don't have a problem with them.



Is there any configuration we could use to keep the GPUs from receiving VLARs?

ID: 1373268
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1373276 - Posted: 30 May 2013, 4:34:25 UTC - in response to Message 1373239.  
Last modified: 30 May 2013, 4:52:50 UTC

In the previous thread, there were some comments about VLARs going to Nvidia GPUs now. I'm still on v6 for another 10 hours while draining my cache, and I notice that I have 3 VLARs and a non-VLAR now running on my 670 with x41zc, Cuda 5.00. I don't see any adverse effects except that run times will be longer than normal and there is a little lag in the system (responsible for any typos in this post!) :)

But the odd thing is that my gpu temperature is more than 10 degrees below normal. No downclock and gpu utilization is at a constant 99%. CPU usage is below normal for 6.10 tasks. Why would these run cooler? I expected the opposite.


It's surprising this class of GPU didn't show pressure here under Beta test with the expected new 2-tasks-per-GPU optimum. If the reported experiences match the general consensus on these cards, I would request review of either:

- Removing VLARs from being sent to these GPUs, OR
- a change in default settings, OR
- an Opt-in/Opt-out feature. [e.g. my own aging Core2Duo with GTX 680 happily crunches them while I watch the Starship Troopers trilogy; I'd like to crunch them because they are longer & should hopefully get more credit]

In General there are a few things to be aware of (VLAR or not):
- V7 does new processing (Autocorrelations) that changes the dynamics quite substantially, including making all task times longer, not comparable to V6.
- If you were running 3, 4 or more tasks on the same GPU before, that is quite likely too many under V7. Autocorrelations are very memory intensive, so reduce it to 2 at once per device. This is the 'main' reason for running cooler.
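
For a stock setup, a minimal app_config.xml along these lines will hold it to 2 at a time per GPU (this assumes the v7 application name is setiathome_v7 and a client new enough to read app_config.xml, roughly 7.0.40 or later; the cpu_usage figure is just a placeholder):

<app_config>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>  <!-- 0.5 of a GPU per task = 2 tasks per GPU -->
      <cpu_usage>0.04</cpu_usage>  <!-- CPU reserved per GPU task, placeholder -->
    </gpu_versions>
  </app>
</app_config>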

VLAR in particular:
- will be noticeable if you have too many running at once. If you experience any display lag with these, reduce the # of instances from 4 or 3 to 2.
- If problems persist, suspect 'system overcommit'. Try the following settings in the empty supplied cfg file for the app:
[mbcuda]
processpriority = normal
pfblockspersm = 4
pfperiodsperlaunch = 50


These are settings overrides that improve CPU responsiveness to the app, while reducing pressure from VLAR-specific pulsefind loadings.

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1373276