Running out of work

Message boards : Number crunching : Running out of work

NexusNet
Joined: 20 Sep 05
Posts: 9
Credit: 149,033
RAC: 0
United States
Message 222527 - Posted: 29 Dec 2005, 0:45:01 UTC

Are others seeing "no work" messages in response to work requests? Or 12/28/2005 7:32:35 PM||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu] messages? I didn't see similar posts in my quick skim of thread titles. Wanted to see what others were experiencing before looking further on my end, as otherwise my network and servers seem to be normal.

Thanks,

Robert


ID: 222527
Thorz
Joined: 15 May 05
Posts: 14
Credit: 175,347
RAC: 0
United States
Message 222531 - Posted: 29 Dec 2005, 0:49:40 UTC

yup, I have been getting that also
ID: 222531
Landroval
Joined: 7 Oct 01
Posts: 188
Credit: 2,098,881
RAC: 1
United States
Message 222540 - Posted: 29 Dec 2005, 1:05:20 UTC - in response to Message 222527.  

Are others seeing "no work" messages in response to work requests? Or 12/28/2005 7:32:35 PM||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu] messages? I didn't see similar posts in my quick skim of thread titles. Wanted to see what others were experiencing before looking further on my end, as otherwise my network and servers seem to be normal.

There's an outage every Wednesday while maintenance is done on the database. And there's always a congested period for a few hours after it comes back up as the accumulated work tries to get done. So some "can't connect" messages during those periods aren't unusual. If it's still going on several hours from now or tomorrow, it's worth troubleshooting, but otherwise it's probably outage-related.

Cheers,
Brian

If you think education is expensive, try ignorance.
ID: 222540
SteveSueTyler
Joined: 12 Feb 01
Posts: 6
Credit: 2,291,762
RAC: 11
United States
Message 222542 - Posted: 29 Dec 2005, 1:07:59 UTC - in response to Message 222540.  

Are others seeing "no work" messages in response to work requests? Or 12/28/2005 7:32:35 PM||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu] messages? I didn't see similar posts in my quick skim of thread titles. Wanted to see what others were experiencing before looking further on my end, as otherwise my network and servers seem to be normal.

There's an outage every Wednesday while maintenance is done on the database. And there's always a congested period for a few hours after it comes back up as the accumulated work tries to get done. So some "can't connect" messages during those periods aren't unusual. If it's still going on several hours from now or tomorrow, it's worth troubleshooting, but otherwise it's probably outage-related.

Cheers,
Brian


ID: 222542
SteveSueTyler
Joined: 12 Feb 01
Posts: 6
Credit: 2,291,762
RAC: 11
United States
Message 222545 - Posted: 29 Dec 2005, 1:12:19 UTC

I am getting the same message

12/28/2005 7:38:44 PM||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu]
12/28/2005 7:38:45 PM|SETI@home|Temporarily failed download of 18fe05aa.28993.19794.273584.1.116: system I/O
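
For anyone who wants to sort or filter messages like these, here is a minimal Python sketch that splits one line in the format shown above into timestamp, project, and message. It assumes the fields are separated by `|`, that an empty middle field means the message is not tied to a project, and that the timestamp is the client's local time in the 12-hour form quoted here. `LogLine` and `parse_log_line` are names made up for illustration; they are not part of BOINC.

```python
from datetime import datetime
from typing import NamedTuple


class LogLine(NamedTuple):
    when: datetime   # local time as printed by the client (assumption)
    project: str     # empty string when no project is named
    message: str


def parse_log_line(line: str) -> LogLine:
    """Split 'timestamp|project|message' into its three fields."""
    # Split on the first two pipes only, so a "|" inside the message survives.
    stamp, project, message = line.strip().split("|", 2)
    # Format assumed from the quoted lines: month/day/year, 12-hour clock.
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    return LogLine(when, project, message)
```

Calling `parse_log_line` on the second line above should give a `project` of `SETI@home` and a message starting with "Temporarily failed download".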

ID: 222545
Dr. Bob
Joined: 1 Apr 03
Posts: 78
Credit: 623,977
RAC: 0
United States
Message 222551 - Posted: 29 Dec 2005, 1:32:57 UTC - in response to Message 222527.  

Are others seeing "no work" messages in response to work requests? Or 12/28/2005 7:32:35 PM||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu] messages? I didn't see similar posts in my quick skim of thread titles. Wanted to see what others were experiencing before looking further on my end, as otherwise my network and servers seem to be normal.

Thanks,

Robert


================
I have the same problem; some machines on the network have just stopped downloading work. All except my Macintosh, which just keeps crunching and uploading and downloading units.

Dr. Bob

Robert L. Hanson, Ed.D.
ID: 222551
Jack Gulley
Joined: 4 Mar 03
Posts: 423
Credit: 526,566
RAC: 0
United States
Message 222556 - Posted: 29 Dec 2005, 1:54:48 UTC - in response to Message 222540.  

There's an outage every Wednesday while maintenance is done on the database. And there's always a congested period for a few hours after it comes back up as the accumulated work tries to get done. So some "can't connect" messages during those periods aren't unusual. If it's still going on several hours from now or tomorrow, it's worth troubleshooting, but otherwise it's probably outage-related.

That may not be true in this case. The recovery from the outage went outstandingly well for the first two hours. My machines had no trouble at all uploading and reporting results. A few even got validated quickly.

There were the usual "Started", "Couldn't connect", and "Temporarily failed" messages for less than an hour, but that cleared up, with downloads taking only 10 seconds for the next hour.

The problems seemed to start about two hours into the recovery when my machines started coming out of backoff on requesting new work. The first requests all resulted in "No work from project". But that was not unusual because the number of seconds of work being requested was less than the time it takes those machines to do a WU. Then ten minutes later the second round of requests for work with seconds larger than the run time also got "No work from project"?

This was followed by a long series of "Started download of #", followed about 18 seconds later by a "Couldn't connect to hostname" error and a "Temporarily failed download of #". After about 20 minutes of this, the results downloaded.

The Cricket graphs show there was a drop in download traffic during this 20 minute period. The project may have run out of work, but the Server status page did not indicate such a problem.

So there was some sort of "glitch" in the process causing a "No work from project". It could have been something the staff at Berkeley was doing or the servers could have been backlogged and not able to move work from the "Ready to Send" queue. But that seems to have cleared now.
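
As a rough illustration of the "backoff" mentioned above, here is a sketch of randomized exponential backoff in Python. The base delay, the cap, and the function name `backoff_delays` are all assumptions chosen for the example; this is not the BOINC client's actual schedule. The random spread is there so that many clients coming back after an outage do not all retry at the same moment.

```python
import random


def backoff_delays(attempts, base=60.0, cap=4 * 3600.0, rng=None):
    """Return one retry delay (in seconds) per failed attempt.

    The ceiling doubles after every failure until it reaches `cap`.
    Each delay is drawn uniformly between half the ceiling and the
    ceiling. All numbers here are hypothetical.
    """
    rng = rng or random.Random()
    delays = []
    for n in range(attempts):
        ceiling = min(cap, base * 2 ** n)
        delays.append(rng.uniform(ceiling / 2, ceiling))
    return delays
```

With these made-up defaults, the first retry waits between 30 and 60 seconds, the second between 60 and 120, and so on up to the cap.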
ID: 222556
Landroval
Joined: 7 Oct 01
Posts: 188
Credit: 2,098,881
RAC: 1
United States
Message 222560 - Posted: 29 Dec 2005, 2:00:20 UTC - in response to Message 222556.  

That may not be true in this case. The recovery from the outage went outstandingly well for the first two hours. My machines had no trouble at all uploading and reporting results. A few even got validated quickly.
<snip>
The Cricket graphs show there was a drop in download traffic during this 20 minute period. The project may have run out of work, but the Server status page did not indicate such a problem.

So there was some sort of "glitch" in the process causing a "No work from project". It could have been something the staff at Berkeley was doing or the servers could have been backlogged and not able to move work from the "Ready to Send" queue. But that seems to have cleared now.

If it's cleared, then no problem; these sorts of transient incidents are fairly common. The regular outage will produce the "can't connect" messages, but with the other messages that were posted ("system I/O"), there's obviously something else going on. Nothing major, let's hope. If it continues or recurs, posting the error messages (and a few lines of the log before each error message) can help troubleshoot what's happening.
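
One way to pull out the error messages together with the few lines before them is sketched below in Python. It works on log text that has already been copied into a list of lines. The marker strings come from the messages quoted in this thread, not from any official list, and `errors_with_context` is a hypothetical helper name.

```python
from typing import Iterable, List

# Assumption: these substrings identify the lines worth reporting.
ERROR_MARKERS = ("Couldn't connect", "Temporarily failed")


def errors_with_context(lines: Iterable[str], before: int = 3) -> List[str]:
    """Return each error line plus up to `before` lines preceding it."""
    lines = list(lines)
    keep = set()
    for i, line in enumerate(lines):
        if any(marker in line for marker in ERROR_MARKERS):
            # Keep the error line and the lines just before it.
            keep.update(range(max(0, i - before), i + 1))
    # Emit in original order, each line once even if contexts overlap.
    return [lines[i] for i in sorted(keep)]
```

Pasting the output of something like this into a post keeps the errors next to what the client was doing just before them, without dumping the whole log.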

Happy crunching!
If you think education is expensive, try ignorance.
ID: 222560


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.