Panic Mode On (73) Server problems?

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1215409 - Posted: 8 Apr 2012, 0:31:56 UTC - in response to Message 1215299.  
Last modified: 8 Apr 2012, 0:32:13 UTC

> A lightbulb blew earlier, AND I DON'T HAVE A SPARE! :) Other than that I'm still waiting for my astropulse times to get down to what they actually are. At the moment they're at 197hrs :)

It'll get there. You need 10 validated APs that did not early-exit and had less than 10% blanking. Mine took a while for that to happen. Right about 44 total APs to get 10 that met the criteria. Of course by then, the DCF mechanism was just about to provide reasonable rates anyway.

Because of the way DCF works, it took ~20 tasks to go from a little over 200 hours down to ~175, 15 more to drop down to ~125, and five to get down to ~75. I imagine it would have only taken 5-7 more for the ETA to be pretty close, but I hit that magic 10 and the server knocked the ETA down to 04:41:47. One task finished after that and now they're all showing ~11.5h like they should.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1215409
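For anyone wondering why the estimate crawls at first and then drops quickly, here is a rough sketch of that kind of smoothing. It assumes the client moves DCF only 1% of the way per finished task while the estimate is more than 10x too long, and 10% of the way once it is closer. Those weights, the 0.1 threshold, the starting DCF of 1.0 and the function names are assumptions for illustration, not something stated in this thread. The 200h starting estimate and ~11.5h actual run time are the figures from the post above.

```python
def update_dcf(dcf, actual_hours, raw_estimate_hours):
    """Return the new DCF after one task finishes (assumed smoothing rule)."""
    raw_ratio = actual_hours / raw_estimate_hours          # actual vs. uncorrected estimate
    adj_ratio = actual_hours / (raw_estimate_hours * dcf)  # actual vs. current estimate
    if adj_ratio > 1.1:
        # Task ran over 10% longer than estimated: jump straight up.
        return raw_ratio
    if adj_ratio < 0.1:
        # Estimate is more than 10x too long: move only 1% of the way.
        return 0.99 * dcf + 0.01 * raw_ratio
    # Otherwise move 10% of the way toward the observed ratio.
    return 0.9 * dcf + 0.1 * raw_ratio


def simulate(raw_estimate_hours, actual_hours, tasks):
    """Estimated run time (hours) shown after each of `tasks` completed tasks."""
    dcf = 1.0  # hypothetical starting point
    shown = []
    for _ in range(tasks):
        dcf = update_dcf(dcf, actual_hours, raw_estimate_hours)
        shown.append(raw_estimate_hours * dcf)
    return shown


if __name__ == "__main__":
    # Print every tenth estimate to show the slow crawl and the later speed-up.
    for n, hours in enumerate(simulate(200.0, 11.5, 70), start=1):
        if n % 10 == 0:
            print(f"after {n:2d} tasks: estimate ~{hours:.0f}h")
```

The task counts will not match the 20/15/5 reported above, since real tasks vary in length and not every task counts, but the shape is similar: a long slow crawl, then much bigger steps once the estimate is within 10x of the real run time.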
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14010
Credit: 208,696,464
RAC: 304
Australia
Message 1215416 - Posted: 8 Apr 2012, 0:54:10 UTC - in response to Message 1215409.  


Now if they could just get DCF to work for MB on CPU & GPU systems.
Grant
Darwin NT
ID: 1215416
Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1215836 - Posted: 8 Apr 2012, 22:00:54 UTC

For the last hour or so, all of my scheduler requests on all 3 rigs are ending in timeout. Cricket and status page both look OK. Uploads going through normally. Anyone else?
ID: 1215836
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1215837 - Posted: 8 Apr 2012, 22:06:10 UTC - in response to Message 1215836.  
Last modified: 8 Apr 2012, 22:06:42 UTC

Yup, same here. For the past 20 hrs it has taken 3 or more attempts to get sorted.

Cheers,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1215837
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14010
Credit: 208,696,464
RAC: 304
Australia
Message 1215850 - Posted: 8 Apr 2012, 22:40:44 UTC - in response to Message 1215836.  
Last modified: 8 Apr 2012, 22:46:08 UTC

> For the last hour or so, all of my scheduler requests on all 3 rigs are ending in timeout. Cricket and status page both look OK. Uploads going through normally. Anyone else?

Just checked my logs. Been OK (at least on one machine) for the last hour or 2, but for most of the night the majority of Scheduler requests were timing out.


EDIT- just checked my other machine. Still getting the odd Scheduler timeout there. Looks like it's just the luck of the draw. You might get a Scheduler timeout. Even if that doesn't happen you might get a "Project has no tasks available" message. If you're really lucky you might get some work.
Grant
Darwin NT
ID: 1215850
Britz
Joined: 1 Apr 12
Posts: 1
Credit: 267,233
RAC: 0
United States
Message 1215910 - Posted: 9 Apr 2012, 0:47:09 UTC

I got about 16 GPU and 1 AP WUs today, and that was around 3:00 EDT. Took a long while for the GPU WUs to get returned. Hope it all gets sorted out soon.
ID: 1215910
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1215956 - Posted: 9 Apr 2012, 3:15:06 UTC

My apologies for clogging the pipe, but my 10-day AP-only cache is now full. 3 crunching with 63 "ready to start."

What were the limits again? Wasn't it 50 per [physical] CPU, or did it go back to counting cores? I know for a while it didn't matter if it was 1-16 cores, it was 1 physical CPU and there was a limit.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1215956
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1215978 - Posted: 9 Apr 2012, 3:42:01 UTC

I don't know if this is related to server problems or what, but my i7's tasks in progress that was at its maximum of 800 earlier today is now 699.

This may or may not have anything to do with the fact that I finally pulled the trigger and installed Lunatics around noon today.

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1215978
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1215992 - Posted: 9 Apr 2012, 4:37:28 UTC - in response to Message 1215978.  

> I don't know if this is related to server problems or what, but my i7's tasks in progress that was at its maximum of 800 earlier today is now 699.
>
> This may or may not have anything to do with the fact that I finally pulled the trigger and installed Lunatics around noon today.

My tasks in progress continues to drop, and the server has been saying I've reached my limit for about 7 hours now. Is this anything to do with the switch to Lunatics? Boinc is running GPU work on high priority, even though it's not due for a week or more. (If it matters, I set app_info to do 2 GPUs at a time.)

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1215992
Horacio

Joined: 14 Jan 00
Posts: 536
Credit: 75,967,266
RAC: 0
Argentina
Message 1216001 - Posted: 9 Apr 2012, 5:03:30 UTC - in response to Message 1215992.  

>> I don't know if this is related to server problems or what, but my i7's tasks in progress that was at its maximum of 800 earlier today is now 699.
>>
>> This may or may not have anything to do with the fact that I finally pulled the trigger and installed Lunatics around noon today.
>
> My tasks in progress continues to drop, and the server has been saying I've reached my limit for about 7 hours now. Is this anything to do with the switch to Lunatics? Boinc is running GPU work on high priority, even though it's not due for a week or more. (If it matters, I set app_info to do 2 GPUs at a time.)


I guess the limit was reached only on CPU tasks. If the GPUs are in panic mode, BOINC won't ask for more GPU work, and if you are not using the flops tags then the scheduler is still learning what the speed of the new apps is and meanwhile is using a (slower) default value, which is forcing the panic mode...
(or something like that...)
ID: 1216001
Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1216014 - Posted: 9 Apr 2012, 5:42:09 UTC

Scheduler still seems borked.

I've got over 300 to report, and I'm guessing about 100 or more ghosted, waiting for a good scheduler response or five to start receiving them as resends.

Still getting 99% timeouts.
ID: 1216014
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14010
Credit: 208,696,464
RAC: 304
Australia
Message 1216023 - Posted: 9 Apr 2012, 5:59:35 UTC - in response to Message 1216014.  


Most of today the Scheduler has been hit & miss, but for about the last hour it's all been miss. Every attempt has timed out.
Grant
Darwin NT
ID: 1216023
AndrewM
Volunteer tester

Joined: 5 Jan 08
Posts: 369
Credit: 34,275,196
RAC: 0
Australia
Message 1216031 - Posted: 9 Apr 2012, 6:31:46 UTC

To borrow a line from Wiggo, I'm bouncing off the limits
AndrewM
ID: 1216031
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14010
Credit: 208,696,464
RAC: 304
Australia
Message 1216033 - Posted: 9 Apr 2012, 7:04:07 UTC - in response to Message 1216031.  

> To borrow a line from Wiggo, I'm bouncing off the limits

So am I, now.
But for several hours there I was getting further & further from the limits with each Scheduler timeout.
Grant
Darwin NT
ID: 1216033
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22920
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1216036 - Posted: 9 Apr 2012, 7:31:53 UTC

The schedulers must have had a surfeit of Chocolate Easter Eggs and dozed off in the arm chair...
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1216036
S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 43,822,095
RAC: 0
Netherlands
Message 1216057 - Posted: 9 Apr 2012, 9:15:06 UTC

Every holiday the servers seem to take their own vacation. And who says machines don't have feelings.... pffff

On topic: agreed with previous posters, time-out reached on every attempt.
ID: 1216057
Wiggo
Joined: 24 Jan 00
Posts: 38589
Credit: 261,360,520
RAC: 489
Australia
Message 1216060 - Posted: 9 Apr 2012, 9:20:30 UTC - in response to Message 1216033.  

>> To borrow a line from Wiggo, I'm bouncing off the limits
>
> So am I, now.
> But for several hours there I was getting further & further from the limits with each Scheduler timeout.

Far too much bouncing going on here now, I'm getting a sore neck.

Cheers.
ID: 1216060
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1216135 - Posted: 9 Apr 2012, 14:59:14 UTC - in response to Message 1216001.  

>>> I don't know if this is related to server problems or what, but my i7's tasks in progress that was at its maximum of 800 earlier today is now 699.
>>>
>>> This may or may not have anything to do with the fact that I finally pulled the trigger and installed Lunatics around noon today.
>>
>> My tasks in progress continues to drop, and the server has been saying I've reached my limit for about 7 hours now. Is this anything to do with the switch to Lunatics? Boinc is running GPU work on high priority, even though it's not due for a week or more. (If it matters, I set app_info to do 2 GPUs at a time.)
>
> I guess the limit was reached only on CPU tasks. If the GPUs are in panic mode, BOINC won't ask for more GPU work, and if you are not using the flops tags then the scheduler is still learning what the speed of the new apps is and meanwhile is using a (slower) default value, which is forcing the panic mode...
> (or something like that...)

In addition, the tasks being crunched were sent to be done by the stock applications, so the servers won't yet be applying the run times to anonymous platform averages.

Without flops in app_info.xml the core client will be using the ~3.34e09 Whetstone value as <flops> for CPU work, and ~1.8e09 (0.54*Whetstones) as <flops> for GPU. The CPU value is low by a factor of 6 or more; the GPU value is low by about a factor of 48 (rough estimates based on APRs for stock). DCF will have been driven down so the estimated run times are not longer by that much; in fact, CPU tasks are likely to have short estimates. But the GPU task estimates will still be long, hence the high priority processing.
Joe
ID: 1216135
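The arithmetic in the post above can be sketched roughly as follows. This assumes a simplified model in which the shown estimate is fpops_est / flops * DCF, with one DCF shared by CPU and GPU work. The task size and the DCF of 0.1 are made-up values for illustration, and the function names are hypothetical; the flops figures and the factors of 6 and 48 are the rough numbers quoted in the post.

```python
WHETSTONE_FLOPS = 3.34e9                      # CPU flops default quoted above
GPU_DEFAULT_FLOPS = 0.54 * WHETSTONE_FLOPS    # ~1.8e9, GPU flops default

CPU_TOO_LOW_BY = 6.0    # "low by a factor of 6 or more"
GPU_TOO_LOW_BY = 48.0   # "low by about a factor of 48"


def estimated_seconds(fpops_est, assumed_flops, dcf):
    """Run time the client would show, under the simplified model."""
    return fpops_est / assumed_flops * dcf


def actual_seconds(fpops_est, assumed_flops, too_low_by):
    """Run time if the real speed is `too_low_by` times the assumed flops."""
    return fpops_est / (assumed_flops * too_low_by)


def estimate_ratio(assumed_flops, too_low_by, dcf, fpops_est=3.0e13):
    """Estimated / actual run time; fpops_est is a hypothetical task size."""
    return (estimated_seconds(fpops_est, assumed_flops, dcf)
            / actual_seconds(fpops_est, assumed_flops, too_low_by))


if __name__ == "__main__":
    dcf = 0.1  # hypothetical: DCF already driven down by fast results
    cpu = estimate_ratio(WHETSTONE_FLOPS, CPU_TOO_LOW_BY, dcf)
    gpu = estimate_ratio(GPU_DEFAULT_FLOPS, GPU_TOO_LOW_BY, dcf)
    print("CPU estimate / actual:", round(cpu, 2))
    print("GPU estimate / actual:", round(gpu, 2))
```

With these made-up numbers the CPU estimate comes out at about 0.6x the real run time (short) while the GPU estimate is about 4.8x the real run time (long). A single shared DCF cannot correct both at once, which fits the high priority GPU processing described above.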
Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1216152 - Posted: 9 Apr 2012, 15:22:38 UTC - in response to Message 1216135.  

My whole project is acting a bit weird. Scheduler timeouts, downloads that won't move but suddenly scream through the pipes (100kbs on occasion).


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1216152
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1216153 - Posted: 9 Apr 2012, 15:28:53 UTC - in response to Message 1216135.  

>>>> I don't know if this is related to server problems or what, but my i7's tasks in progress that was at its maximum of 800 earlier today is now 699.
>>>>
>>>> This may or may not have anything to do with the fact that I finally pulled the trigger and installed Lunatics around noon today.
>>>
>>> My tasks in progress continues to drop, and the server has been saying I've reached my limit for about 7 hours now. Is this anything to do with the switch to Lunatics? Boinc is running GPU work on high priority, even though it's not due for a week or more. (If it matters, I set app_info to do 2 GPUs at a time.)
>>
>> I guess the limit was reached only on CPU tasks. If the GPUs are in panic mode, BOINC won't ask for more GPU work, and if you are not using the flops tags then the scheduler is still learning what the speed of the new apps is and meanwhile is using a (slower) default value, which is forcing the panic mode...
>> (or something like that...)

I suspected something like that. However, it has downloaded new work for both CPU and GPU under Anonymous Platform. (Hasn't returned any of it yet; still working on the previous stuff, but returning it in a day instead of the usual 5-6 days.)

> In addition, the tasks being crunched were sent to be done by the stock applications, so the servers won't yet be applying the run times to anonymous platform averages.
>
> Without flops in app_info.xml the core client will be using the ~3.34e09 Whetstone value as <flops> for CPU work, and ~1.8e09 (0.54*Whetstones) as <flops> for GPU. The CPU value is low by a factor of 6 or more; the GPU value is low by about a factor of 48 (rough estimates based on APRs for stock). DCF will have been driven down so the estimated run times are not longer by that much; in fact, CPU tasks are likely to have short estimates. But the GPU task estimates will still be long, hence the high priority processing.
> Joe

I suspected that too. (The basics; my eyes glazed over on the details.)

Tasks in progress is currently sitting at 659. Oddly, it hasn't made contact in almost 4 hours. I'll have to remote in and see what's up when I switch to my work laptop (right now I can't because I'm on my work desktop and IE crashes when I use logmein on it).

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1216153
©2026 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.