Panic Mode On (71) Server problems?

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1204877 - Posted: 11 Mar 2012, 12:59:50 UTC - in response to Message 1204876.  
Last modified: 11 Mar 2012, 13:01:21 UTC

I had no idea there was a stock Cuda AP application for Linux hosts.


No, it's worse... I'm talking about the Windows stock Cuda app for pre-Fermi cards. And should have said so, of course:)

Phew. The other option is that I am even more crazy than I think I am. However, I think it is clear that there is a bit of wonkiness across the stock apps, which hopefully will not exist once everything goes v7.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1204877
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1204888 - Posted: 11 Mar 2012, 13:53:57 UTC - in response to Message 1204877.  

REMEMBER Murphy.. And don't put the mockers on it:-)

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1204888
ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1204890 - Posted: 11 Mar 2012, 14:03:21 UTC - in response to Message 1204787.  
Last modified: 11 Mar 2012, 14:10:04 UTC

Two loose BNC cable-ends lying on the floor, next to the desk where the receptionist's computer had been...

Yep, I know that feeling! -- From working as a Machine Scientist at large collider labs where the users used to move the data acquisition computers (e.g. VAXStation 4000s) around with gay abandon and scant regard for the Ethernet cables. Worst is when someone clones a machine and you end up with two computers both claiming the same IP (giving my age away there, as that was obviously pre-DHCP...).
ID: 1204890
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1204994 - Posted: 11 Mar 2012, 20:40:00 UTC

Humm, I'm not sure if I'm imagining this or not, but after a failed attempt to get work and the usual timeout, if you generate a user request, it seems that where it was a case of "No tasks available", suddenly there are tasks available, and in reasonable [to me] quantity.
I've had 3 occasions now where boinc has asked for GPU tasks and been fobbed off with no tasks, and when I generated a user update, I got 12 GPU tasks..
Once is happenstance, twice is coincidence, but 3 times?

Anyway I thought it worthy of comment:-)

Cheers,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1204994
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65794
Credit: 55,293,173
RAC: 49
United States
Message 1205027 - Posted: 11 Mar 2012, 22:20:13 UTC
Last modified: 11 Mar 2012, 22:34:33 UTC

On a 460 I'm seeing 26-28 minutes for each gpu mb wu to be completed in, sometimes less of course, like 17 or 5-6 mins... And this is in Win7 x64 in an optimized x41g install.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1205027
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1205040 - Posted: 11 Mar 2012, 23:06:41 UTC - in response to Message 1205027.  

yup, had a lot of 3 min GPU tasks, some 9 min ones and damm few 25 min ones.. And a few sub 2 min ones... looks like a load of 2.1 credit stuff:-)

Cheers,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1205040
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1205043 - Posted: 11 Mar 2012, 23:30:01 UTC - in response to Message 1205029.  

And now we can celebrate that it's only 39,785 AP's out in the wild. Let's see how close to zero we will come before the new AP app is released and we can start splitting AP again.

I am going with -0 for my guess.

In reality they might hold off until the new servers are all humming along happily again.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1205043
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1205063 - Posted: 12 Mar 2012, 0:44:34 UTC - in response to Message 1205029.  

And now we can celebrate that it's only 39,785 AP's out in the wild. Let's see how close to zero we will come before the new AP app is released and we can start splitting AP again.

If it's anything like when we went from the first-generation to _v5, they turned the splitters off for WEEKS to get as close to zero as they could before turning the new version on.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1205063
AndrewM
Volunteer tester
Joined: 5 Jan 08
Posts: 369
Credit: 34,275,196
RAC: 0
Australia
Message 1205068 - Posted: 12 Mar 2012, 0:52:12 UTC - in response to Message 1205063.  
Last modified: 12 Mar 2012, 0:53:00 UTC

And now we can celebrate that it's only 39,785 AP's out in the wild. Let's see how close to zero we will come before the new AP app is released and we can start splitting AP again.

If it's anything like when we went from the first-generation to _v5, they turned the splitters off for WEEKS to get as close to zero as they could before turning the new version on.


How many project staff were full-time or part-time back then?
AndrewM
ID: 1205068
Wiggo
Joined: 24 Jan 00
Posts: 34984
Credit: 261,360,520
RAC: 489
Australia
Message 1205069 - Posted: 12 Mar 2012, 0:57:40 UTC - in response to Message 1205063.  

And now we can celebrate that it's only 39,785 AP's out in the wild. Let's see how close to zero we will come before the new AP app is released and we can start splitting AP again.

If it's anything like when we went from the first-generation to _v5, they turned the splitters off for WEEKS to get as close to zero as they could before turning the new version on.

I think that you'll find that they wait until all those v505's are out of the system first before they upgrade to v6 and then they'll fire the splitters back up. ;)

Cheers.
ID: 1205069
Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,383,149
RAC: 0
United States
Message 1205091 - Posted: 12 Mar 2012, 4:01:54 UTC - in response to Message 1205027.  

On a 460 I'm seeing 26-28 minutes for each gpu mb wu to be completed in, sometimes less of course, like 17 or 5-6 mins... And this is in Win7 x64 in an optimized x41g install.


What clock speeds and how many WU at a time? I run mine at 850mhz on air, and 888mhz on water. 2 WU's at a time on all of them. The 888mhz cards get them done in about 20min, the 850mhz about a minute slower.
ID: 1205091
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65794
Credit: 55,293,173
RAC: 49
United States
Message 1205101 - Posted: 12 Mar 2012, 6:04:08 UTC - in response to Message 1205091.  
Last modified: 12 Mar 2012, 6:27:56 UTC

On a 460 I'm seeing 26-28 minutes for each gpu mb wu to be completed in, sometimes less of course, like 17 or 5-6 mins... And this is in Win7 x64 in an optimized x41g install.


What clock speeds and how many WU at a time? I run mine at 850mhz on air, and 888mhz on water. 2 WU's at a time on all of them. The 888mhz cards get them done in about 20min, the 850mhz about a minute slower.

2 on air, 779MHz core, shader 1558MHz and memory 2010MHz. This evening I updated the chipset from 2006 MS drivers to 2009 Intel drivers (video uses 290.53 on W7 x64), which meant I had to reset the PC. I've tried 800MHz core and the driver will crash when I play Civ4 (with S@H disabled of course). The core voltage is at 1000mV.

SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31627.5384.13.10.38_0 00:17:31 (00:01:11) 3/11/2012 10:37:32 PM 3/11/2012 10:41:21 PM 0.04C + 0.50NV Reported: OK *
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31627.5384.13.10.43_1 00:17:46 (00:01:14) 3/11/2012 10:31:46 PM 3/11/2012 10:36:09 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ag.30610.3339.8.10.151_0 00:09:26 (00:00:43) 3/11/2012 8:48:02 PM 3/11/2012 10:22:33 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ag.30610.3339.8.10.148_1 00:02:16 (00:00:12) 3/11/2012 6:26:16 PM 3/11/2012 6:28:16 PM 0.04C + 0.50NV Reported: OK *
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31327.2112.12.10.203_0 00:18:15 (00:01:14) 3/11/2012 6:23:55 PM 3/11/2012 6:28:16 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 10ja12aa.26593.885.12.10.104_1 00:00:26 (00:00:02) 3/11/2012 6:23:07 PM 3/11/2012 6:28:16 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31327.2112.12.10.210_0 00:17:28 (00:01:12) 3/11/2012 6:22:22 PM 3/11/2012 6:23:07 PM 0.04C + 0.50NV Reported: OK *
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31327.2112.12.10.208_0 00:00:36 (00:00:03) 3/11/2012 6:01:43 PM 3/11/2012 6:07:07 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31327.2112.12.10.223_0 00:18:44 (00:00:50) 3/11/2012 4:42:29 PM 3/11/2012 6:01:43 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31327.2112.12.10.214_0 00:05:09 (00:00:17) 3/11/2012 4:42:29 PM 3/11/2012 6:01:43 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31327.2112.12.10.225_0 00:29:23 (00:01:13) 3/11/2012 4:25:06 PM 3/11/2012 4:27:10 PM 0.04C + 0.50NV Reported: OK *
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31327.2112.12.10.228_0 00:29:28 (00:01:14) 3/11/2012 4:11:43 PM 3/11/2012 4:17:47 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31327.2112.12.10.230_0 00:29:20 (00:01:14) 3/11/2012 3:55:41 PM 3/11/2012 4:01:46 PM 0.04C + 0.50NV Reported: OK
SETI@home 6.10 setiathome_enhanced (cuda_fermi) 09ja12ad.31627.885.13.10.122_0 00:29:27 (00:01:13) 3/11/2012 3:42:11 PM 3/11/2012 3:50:10 PM 0.04C + 0.50NV Reported: OK

The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1205101
Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,383,149
RAC: 0
United States
Message 1205107 - Posted: 12 Mar 2012, 7:41:13 UTC - in response to Message 1205101.  

Yeah, you probably aren't going to get much more than 780mhz without upping the voltage. Mine run at 1.02v stock I believe, at 675mhz. I'm running at 1.10v on water, 1.08v on air. Even the one on air still stays cool, ~55* under full crunch at 850mhz.
ID: 1205107
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65794
Credit: 55,293,173
RAC: 49
United States
Message 1205108 - Posted: 12 Mar 2012, 7:57:53 UTC - in response to Message 1205107.  
Last modified: 12 Mar 2012, 8:11:25 UTC

Yeah, you probably aren't going to get much more than 780mhz without upping the voltage. Mine run at 1.02v stock I believe, at 675mhz. I'm running at 1.10v on water, 1.08v on air. Even the one on air still stays cool, ~55* under full crunch at 850mhz.

Ok, I set the core volts at 1075mV, tried 1080mV and it went back to 1075. The slider goes as far as 1087mV max in Afterburner. Stock is 726MHz core here and, let's see, the gpu temp is 35C. It was 43C a few seconds back, but it cooled off for some unknown reason; the gpu usage is at 99% of course.

Gpu-z 0.59 says the memory is at 162MHz on the sensors, Yet everything else says 1009 or 2017...
Sensors also say VDDC is 0.9120v, I set the millivolts to max(1087), to no effect.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1205108
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65794
Credit: 55,293,173
RAC: 49
United States
Message 1205110 - Posted: 12 Mar 2012, 8:29:40 UTC
Last modified: 12 Mar 2012, 8:30:15 UTC

Ok I went from 290.53 to 295.73 and the temp shot up to 52-53C from 32C and the memory went from 162MHz to 1009/2017MHz...


Sure I have the power settings set to stay awake and no sleeping, wish I could though, so updating the driver may have fixed this, maybe, time will tell.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1205110
Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1205116 - Posted: 12 Mar 2012, 9:30:45 UTC - in response to Message 1204035.  

Bane is a Modern English word meaning "that which causes ruin or woe"


i guess Bane caused his own bane :(


Not to worry, we may be replacing Bane for them if they'd like.


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1205116
Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,383,149
RAC: 0
United States
Message 1205117 - Posted: 12 Mar 2012, 9:31:37 UTC - in response to Message 1205108.  

Yeah, you probably aren't going to get much more than 780mhz without upping the voltage. Mine run at 1.02v stock I believe, at 675mhz. I'm running at 1.10v on water, 1.08v on air. Even the one on air still stays cool, ~55* under full crunch at 850mhz.

Ok, I set the core volts at 1075mV, tried 1080mV and it went back to 1075. The slider goes as far as 1087mV max in Afterburner. Stock is 726MHz core here and, let's see, the gpu temp is 35C. It was 43C a few seconds back, but it cooled off for some unknown reason; the gpu usage is at 99% of course.

Gpu-z 0.59 says the memory is at 162MHz on the sensors, Yet everything else says 1009 or 2017...
Sensors also say VDDC is 0.9120v, I set the millivolts to max(1087), to no effect.



Sounds like you had a downclock. Likely it was the reboot and not the driver change that brought the temps, speed, and volts back up. The voltage adjusts in 0.0125 increments, hence why it dropped to 1.075 after setting to 1.08v. You might try backing off the memory clock slightly, say 1900mhz instead of 2000mhz. My boxes are rock solid at 850mhz, 1900mhz, 1.075v. The watercooled 460's run at 888mhz, 1900mhz, and 1.10v.
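
As a rough illustration of that step behaviour, here is a minimal sketch. It assumes the card rounds a requested core voltage down to the nearest 12.5 mV step, which is what the 1.08v to 1.075v drop suggests; the function name is made up for the example and is not part of Afterburner or any driver API.

```python
# Hypothetical sketch: assumes a requested core voltage is rounded DOWN
# to the nearest 12.5 mV (0.0125 V) step. This rule is inferred from the
# 1.08 V -> 1.075 V behaviour described in the thread, not from any
# vendor documentation.
STEP_TENTHS_MV = 125  # 12.5 mV in tenths of a millivolt, to avoid float error


def snap_voltage_mv(requested_mv: int) -> float:
    """Return the largest multiple of 12.5 mV that is not above requested_mv."""
    steps = (requested_mv * 10) // STEP_TENTHS_MV
    return steps * STEP_TENTHS_MV / 10


if __name__ == "__main__":
    for mv in (1075, 1080, 1100):
        print(mv, "mV requested ->", snap_voltage_mv(mv), "mV applied")
```

Under that assumption a 1080 mV request lands on 1075 mV, while 1075 mV and 1100 mV are already valid steps and stay where they are.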
ID: 1205117
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65794
Credit: 55,293,173
RAC: 49
United States
Message 1205118 - Posted: 12 Mar 2012, 9:39:54 UTC - in response to Message 1205117.  
Last modified: 12 Mar 2012, 9:56:39 UTC

Yeah, you probably aren't going to get much more than 780mhz without upping the voltage. Mine run at 1.02v stock I believe, at 675mhz. I'm running at 1.10v on water, 1.08v on air. Even the one on air still stays cool, ~55* under full crunch at 850mhz.

Ok, I set the core volts at 1075mV, tried 1080mV and it went back to 1075. The slider goes as far as 1087mV max in Afterburner. Stock is 726MHz core here and, let's see, the gpu temp is 35C. It was 43C a few seconds back, but it cooled off for some unknown reason; the gpu usage is at 99% of course.

Gpu-z 0.59 says the memory is at 162MHz on the sensors, Yet everything else says 1009 or 2017...
Sensors also say VDDC is 0.9120v, I set the millivolts to max(1087), to no effect.



Sounds like you had a downclock. Likely it was the reboot and not the driver change that brought the temps, speed, and volts back up. The voltage adjusts in 0.0125 increments, hence why it dropped to 1.075 after setting to 1.08v. You might try backing off the memory clock slightly, say 1900mhz instead of 2000mhz. My boxes are rock solid at 850mhz, 1900mhz, 1.075v. The watercooled 460's run at 888mhz, 1900mhz, and 1.10v.

Well not long after I tried 295.73 the core/shaders dropped by about half from 780 to 410 or so...

I'm running 275.50 for the moment, sure it's old, I'm trying to avoid the 280 series as they have reboot bsod problems here, I could try 290.36, but not right now, so since I'm where I'm at on core volts @ 1.087v, I upped the core to 800MHz, of course PhysX failed to install with 275.50, but I think I can live with that. Ok, set Mem to 1900MHz. I know how I got to 1.087v, but how in heck did Ya manage 1.10v?

I increased the core to 825MHz, that seems to be holding, so far, but I'm getting sleepy, so I'm heading back to bed...
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1205118
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1205148 - Posted: 12 Mar 2012, 12:20:22 UTC - in response to Message 1205069.  

And now we can celebrate that it's only 39,785 AP's out in the wild. Let's see how close to zero we will come before the new AP app is released and we can start splitting AP again.

If it's anything like when we went from the first-generation to _v5, they turned the splitters off for WEEKS to get as close to zero as they could before turning the new version on.

I think that you'll find that they wait until all those v505's are out of the system first before they upgrade to v6 and then they'll fire the splitters back up. ;)

Cheers.


ETA is Tuesday.
I'm not the Pope. I don't speak Ex Cathedra!
ID: 1205148
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14654
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1205189 - Posted: 12 Mar 2012, 17:19:29 UTC - in response to Message 1205179.  

And now we can celebrate that it's only 39,785 AP's out in the wild. Let's see how close to zero we will come before the new AP app is released and we can start splitting AP again.

If it's anything like when we went from the first-generation to _v5, they turned the splitters off for WEEKS to get as close to zero as they could before turning the new version on.

I think that you'll find that they wait until all those v505's are out of the system first before they upgrade to v6 and then they'll fire the splitters back up. ;)

Cheers.


ETA is Tuesday.

Ok, good to hear, but now to the obvious questions for us app_info lunatics:

Does that mean that we will not get any new AP units for our opt apps CPU: ap_5.05r409_SSE.exe and ATI GPU: ap_5.06_win_x86_SSE2_OpenCL_ATI_r521.exe ?

In other words, when our current AP cache is crunched and empty, and the splitters start up again, should we just remove those old apps, because they will never again run any AP's? Are we going to get something in modern times, to replace them? (CPU and GPU?)

End of question time for now.

The developers are working on replacement apps even as we speak. They won't be available instantaneously after the stock apps are released - that can't happen until the new installer can be tested on live data and project configuration settings.

But the plan is that there will be a new installer, which will be compatible both with any old v505 work (there are bound to be stray re-issues for weeks, perhaps months, after release) and the new v6 work.

Just don't ask which Tuesday.
ID: 1205189
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.