Panic Mode On (80) Server Problems?

Message boards : Number crunching : Panic Mode On (80) Server Problems?
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1323743 - Posted: 2 Jan 2013, 20:18:07 UTC - in response to Message 1323736.  

From a quick glance over the above posts, am I correct in assuming the situation can be summarized as follows:

There are plenty of work units to send, but they are not getting sent. (For whatever reason.)


The situation is normal after a period of machines running down their caches: thousands of machines are all requesting all the work they can at the same time.

Looking at my list of all tasks, I have been allocated around 150 tasks across my machines in the past 15 minutes. A quick check suggests they all downloaded without any ghosts being made in the process.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1323743 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1323746 - Posted: 2 Jan 2013, 20:24:39 UTC - in response to Message 1323736.  

From a quick glance over the above posts, am I correct in assuming the situation can be summarized as follows:

There are plenty of work units to send, but they are not getting sent. (For whatever reason.)

The number of ready to send has dropped by over 40,000.

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1323746 · Report as offensive
mikeej42

Joined: 26 Oct 00
Posts: 109
Credit: 791,875,385
RAC: 9
United States
Message 1323749 - Posted: 2 Jan 2013, 20:29:38 UTC - in response to Message 1323723.  


Even if I query from the same server every 15 seconds I can not get any tasks. I never get the message that the request was too recent.

Anyone have an explanation or what I need to change?

The work requests are fulfilled by the feeder, which only holds a few hundred tasks at a time. It also only refills every few seconds. Originally I think it was 100 tasks every 60 seconds, but it has been estimated that it has to be much higher and faster these days.
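
To make the feeder explanation above concrete, here is a toy Python model. It is only a sketch: the function name `simulate_feeder` and every number in it (100 slots, a 60-second refill, 20 requests per second, 10 tasks per request) are assumptions for illustration, not real server settings or BOINC code.

```python
def simulate_feeder(slots=100, refill_interval=60, requests_per_second=20,
                    tasks_per_request=10, duration=600):
    """Toy model of a fixed-size feeder cache that is refilled on a timer.

    All default values are illustrative assumptions, not measured settings.
    """
    available = slots   # tasks currently sitting in the feeder
    served = 0          # requests that received at least one task
    empty = 0           # requests answered with "no tasks available"
    tasks_sent = 0
    for second in range(duration):
        # The feeder tops its slots back up once per refill interval.
        if second > 0 and second % refill_interval == 0:
            available = slots
        for _ in range(requests_per_second):
            if available == 0:
                empty += 1
                continue
            granted = min(tasks_per_request, available)
            available -= granted
            tasks_sent += granted
            served += 1
    return served, empty, tasks_sent


served, empty, tasks_sent = simulate_feeder()
print(f"requests served: {served} of {served + empty}")
print(f"requests turned away: {empty}")
print(f"tasks sent: {tasks_sent}")
```

With these made-up numbers the feeder is drained within the first second after each refill, so only 100 of 12,000 requests get any work in ten minutes. That is roughly what it looks like when many hosts with empty caches hit the scheduler at once.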


I have over 150 machines that have not got any tasks today and they are getting tasks from Einstein which has a resource of 0.

Any idea why polling less than 303 seconds does not return the normal error about the request being too recent?
ID: 1323749 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323750 - Posted: 2 Jan 2013, 20:34:48 UTC - in response to Message 1323746.  

From a quick glance over the above posts, am I correct in assuming the situation can be summarized as follows:

There are plenty of work units to send, but they are not getting sent. (For whatever reason.)

The number of ready to send has dropped by over 40,000.

Yay! My out-of-work machine got one new GPU task!
ID: 1323750 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1323752 - Posted: 2 Jan 2013, 20:37:15 UTC - in response to Message 1323749.  


Even if I query from the same server every 15 seconds I can not get any tasks. I never get the message that the request was too recent.

Anyone have an explanation or what I need to change?

The work requests are fulfilled by the feeder, which only holds a few hundred tasks at a time. It also only refills every few seconds. Originally I think it was 100 tasks every 60 seconds, but it has been estimated that it has to be much higher and faster these days.


I have over 150 machines that have not got any tasks today and they are getting tasks from Einstein which has a resource of 0.

Any idea why polling less than 303 seconds does not return the normal error about the request being too recent?

They must have switched that off, perhaps temporarily. I am not getting that from BETA either.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1323752 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323755 - Posted: 2 Jan 2013, 20:42:10 UTC - in response to Message 1323750.  


Yay! My out-of-work machine got one new GPU task!

...which took 7'34" to complete...

ID: 1323755 · Report as offensive
mikeej42

Joined: 26 Oct 00
Posts: 109
Credit: 791,875,385
RAC: 9
United States
Message 1323757 - Posted: 2 Jan 2013, 20:44:32 UTC - in response to Message 1323752.  

Okay Thanks.

One of my machines finally got a task. Too many machines with empty caches fighting at the same time.

Guess I will just have to find some alcohol to get some patience, :^}
ID: 1323757 · Report as offensive
Mark Lybeck

Joined: 9 Aug 99
Posts: 245
Credit: 216,677,290
RAC: 173
Finland
Message 1323763 - Posted: 2 Jan 2013, 20:56:04 UTC - in response to Message 1323749.  


I have over 150 machines that have not got any tasks today and they are getting tasks from Einstein which has a resource of 0.

Any idea why polling less than 303 seconds does not return the normal error about the request being too recent?


Um.. How do you fit all 150 machines at home? Or did you hijack a corporate server farm? How do you finance the power bill for those 150 machines?

A moderate assumption of 150 W/machine would mean 22,5 kW. At 0,1 €/kWh this would cost you 2,25 €/hour -> 2,25 € * 24 = 54 €/day -> 365 * 54 €/day = 19710 €/year to run.

With the power you're spending you could trade in your car for a new one every year.
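
The same back-of-the-envelope sum as a short Python sketch. The function name `yearly_power_cost` is made up for illustration, and the inputs (150 machines, 150 W each, 0.10 €/kWh, running around the clock) are the assumptions from the estimate above, not measured figures.

```python
def yearly_power_cost(machines=150, watts_per_machine=150.0, euro_per_kwh=0.10):
    """Return (total kW, EUR per hour, EUR per day, EUR per year).

    Assumes every machine draws the same constant power all year.
    """
    total_kw = machines * watts_per_machine / 1000.0  # W -> kW
    per_hour = total_kw * euro_per_kwh                # 1 kW for 1 h = 1 kWh
    per_day = per_hour * 24
    per_year = per_day * 365
    return total_kw, per_hour, per_day, per_year


total_kw, per_hour, per_day, per_year = yearly_power_cost()
print(f"load: {total_kw:.1f} kW")
print(f"cost: {per_hour:.2f} EUR/hour, {per_day:.2f} EUR/day, {per_year:.0f} EUR/year")
```

Changing `watts_per_machine` or `euro_per_kwh` shows how sensitive the yearly figure is to those two guesses.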

ID: 1323763 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1323765 - Posted: 2 Jan 2013, 21:02:00 UTC - in response to Message 1323763.  


I have over 150 machines that have not got any tasks today and they are getting tasks from Einstein which has a resource of 0.

Any idea why polling less than 303 seconds does not return the normal error about the request being too recent?


Um.. How do you fit all 150 machines at home? Or did you hijack a corporate server farm? How do you finance the power bill for those 150 machines?

A moderate assumption of 150 W/machine would mean 22,5 kW. At 0,1 €/kWh this would cost you 2,25 €/hour -> 2,25 € * 24 = 54 €/day -> 365 * 54 €/day = 19710 €/year to run.

With the power you're spending you could trade in your car for a new one every year.

IIRC he owns a server farm, which he runs projects on when he doesn't have it leased out.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1323765 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323768 - Posted: 2 Jan 2013, 21:07:19 UTC - in response to Message 1323757.  

Guess I will just have to find some alcohol to get some patience, :^}

Yep, I know that feelin'... :-0!
ID: 1323768 · Report as offensive
mikeej42

Joined: 26 Oct 00
Posts: 109
Credit: 791,875,385
RAC: 9
United States
Message 1323773 - Posted: 2 Jan 2013, 21:22:28 UTC - in response to Message 1323763.  

LOL
Yes, these are servers at my work that I can use when they are not doing their "real work". I use them in between test cycles. They are HP c7000 Blade Chassis mounted 4 to a rack, and the company was actually experimenting with special cooling systems for the equipment racks that it was going to sell. There are three 160 kW cooling systems in our lab. The cooling system alone could handle several large homes. The company has since abandoned the cooling project.

ID: 1323773 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1323789 - Posted: 2 Jan 2013, 21:38:23 UTC - in response to Message 1323773.  

LOL
Yes, these are servers at my work that I can use when they are not doing their "real work". I use them in between test cycles. They are HP c7000 Blade Chassis mounted 4 to a rack, and the company was actually experimenting with special cooling systems for the equipment racks that it was going to sell. There are three 160 kW cooling systems in our lab. The cooling system alone could handle several large homes. The company has since abandoned the cooling project.

I'm working on a project that's tooling up to use carbon-dioxide mixed-phase cooling in its system (particle detectors, not CPUs), but I'd love to commercialise it for processor cooling. Probably too expensive for home computing (a hefty compressor is needed to liquefy the CO2), but maybe for server farms?
ID: 1323789 · Report as offensive
Profile Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 1323866 - Posted: 2 Jan 2013, 22:18:51 UTC

Greetings,

Dang near 5,000,000 WUs between MB and AstroPulse and my BOINC client event log says project has no new tasks?

What gives?

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 1323866 · Report as offensive
Profile Qui-Gon
Volunteer tester
Joined: 15 May 99
Posts: 2940
Credit: 19,199,902
RAC: 11
United States
Message 1323926 - Posted: 2 Jan 2013, 23:21:49 UTC - in response to Message 1323866.  

Greetings,

Dang near 5,000,000 WUs between MB and AstroPulse and my BOINC client event log says project has no new tasks?

What gives?

Keep on BOINCing...! :)

Same thing happening to me. I took a break from crunching over the holidays so my hoppers are empty. Now that we're back up again, I can't get any work? That is disappointing.

PS: Happy New Year Siran
ID: 1323926 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9959
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1323928 - Posted: 2 Jan 2013, 23:33:00 UTC

Just got 24 CPU tasks when I wanted GPU; when I looked, all 24 are VLARs!!!


ID: 1323928 · Report as offensive
mikeej42

Joined: 26 Oct 00
Posts: 109
Credit: 791,875,385
RAC: 9
United States
Message 1323952 - Posted: 3 Jan 2013, 0:43:24 UTC - in response to Message 1323926.  
Last modified: 3 Jan 2013, 1:02:40 UTC

Greetings,

Dang near 5,000,000 WUs between MB and AstroPulse and my BOINC client event log says project has no new tasks?

What gives?

Keep on BOINCing...! :)

Same thing happening to me. I took a break from crunching over the holidays so my hoppers are empty. Now that we're back up again, I can't get any work? That is disappointing.

PS: Happy New Year Siran

It appears that there are so many hosts with empty caches all trying to get tasks at the same time that the system is having trouble getting the tasks delivered. Tasks are coming out very slowly. I have gotten a few hundred between all my hosts today. We just need to remain patient while the servers get caught up.

Enjoy a mug of your favorite beverage and have a happy new year.
ID: 1323952 · Report as offensive
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1323966 - Posted: 3 Jan 2013, 1:32:00 UTC
Last modified: 3 Jan 2013, 1:42:03 UTC

The cricket graph is well short of max,
something is still broken.
The work can't get to the pipe to go slowly through it.
There surely can't be that many VLARs not going to GPUs??!!
I would really like to get back to crunching VLARs on my ATI GPU but Eric and Co won't let me. [/whinge]
ID: 1323966 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13946
Credit: 208,696,464
RAC: 304
Australia
Message 1323991 - Posted: 3 Jan 2013, 3:47:43 UTC - in response to Message 1323966.  
Last modified: 3 Jan 2013, 3:49:47 UTC

something is still broken.

Yep.
If there are that many WUs ready to send and everything were working as it should, then the traffic would be maxed out. It's not, and most requests for work result in the "No tasks available" message.


EDIT- one of my systems just scored 38 WUs for the GPU. They're not shorties, so it should last for a few hours before they're all done.
Grant
Darwin NT
ID: 1323991 · Report as offensive
ExchangeMan
Volunteer tester

Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1323995 - Posted: 3 Jan 2013, 4:08:15 UTC - in response to Message 1323991.  

I just got 44 GPU work units - unfortunately all shorties.

ID: 1323995 · Report as offensive
Profile Peter M. Ferrie
Volunteer tester

Joined: 28 Mar 03
Posts: 86
Credit: 9,967,062
RAC: 0
United States
Message 1323998 - Posted: 3 Jan 2013, 4:27:58 UTC

if there really are 5,000,000 + workunits ready to send ... it will be the great wu flood of 2013 .. head for higher cpu's
ID: 1323998 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.