Panic Mode On (12) Server problems

Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 860478 - Posted: 1 Feb 2009, 1:31:38 UTC - in response to Message 860474.  

I have not seen any slowdowns. I seem to be able to get right to a site. If you're on a cable modem, maybe every kid in your town is online. As far as work units go, I'm trying to whittle my massive One day cache down to a one day cache. Won't need any new work for at least 3 days.

Old James
ID: 860478 · Report as offensive
Profile littlegreenmanfrommars
Volunteer tester
Joined: 28 Jan 06
Posts: 1410
Credit: 934,158
RAC: 0
Australia
Message 861025 - Posted: 2 Feb 2009, 7:52:41 UTC

Getting a few of these, today.
Happening on different machines, at different locations, connected to the net by different methods:

2/02/2009 6:37:45 PM|SETI@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 12 completed tasks
2/02/2009 6:38:07 PM||Project communication failed: attempting access to reference site
2/02/2009 6:38:10 PM||Internet access OK - project servers may be temporarily down.

Looks like the validator again.
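
The three log lines above share a simple layout, which makes them easy to scan for on several machines at once. A minimal Python sketch, assuming the `timestamp|project|message` layout inferred from those lines (not taken from BOINC documentation); `parse_log_line` and `is_comm_failure` are hypothetical helper names:

```python
def parse_log_line(line):
    """Split one message-log line into (timestamp, project, message).

    Assumes the pipe-separated layout seen in the lines quoted above.
    The project field is empty for client-wide messages. A line with
    fewer than two "|" characters raises ValueError.
    """
    timestamp, project, message = line.rstrip("\n").split("|", 2)
    return timestamp, project, message


def is_comm_failure(message):
    """True if the message reports a failed attempt to contact the project."""
    return message.startswith("Project communication failed")


# The three lines quoted in the post above.
log = [
    "2/02/2009 6:37:45 PM|SETI@home|Sending scheduler request: Requested by user. "
    "Requesting 0 seconds of work, reporting 12 completed tasks",
    "2/02/2009 6:38:07 PM||Project communication failed: attempting access to reference site",
    "2/02/2009 6:38:10 PM||Internet access OK - project servers may be temporarily down.",
]

# Collect the timestamps of every failed contact.
failures = [
    timestamp
    for timestamp, project, message in map(parse_log_line, log)
    if is_comm_failure(message)
]
print(failures)
```

Splitting with a maximum of two splits keeps any later `|` characters inside the message text rather than breaking it into extra fields.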
ID: 861025 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 861643 - Posted: 3 Feb 2009, 15:55:21 UTC

OK -- it is Tuesday morning, time to prepare for the weekly twelve-hour communications outage (4 hours of maintenance, then an 8-hour post-maintenance data storm).

My approach is to push results in prior to the outage, then suspend SETI and let other projects take up the slack until late Tuesday night.

ID: 861643 · Report as offensive
Profile Jack Zhang
Volunteer tester
Joined: 2 Jul 06
Posts: 206
Credit: 6,142,449
RAC: 0
Canada
Message 861746 - Posted: 4 Feb 2009, 2:42:40 UTC - in response to Message 860461.  

No such problems here, but getting a lot of "SETI@home|Message from server: (Project has no jobs available)" over the last couple of days.


Am crunching so fast that I get those all the time right now...
What if Fiction was Fact and Fact was Fiction and vice versa?
ID: 861746 · Report as offensive
Profile littlegreenmanfrommars
Volunteer tester
Joined: 28 Jan 06
Posts: 1410
Credit: 934,158
RAC: 0
Australia
Message 861853 - Posted: 4 Feb 2009, 11:46:34 UTC - in response to Message 861746.  

Then S-L-O-W D-O-W-N!!!

lol
ID: 861853 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 864795 - Posted: 12 Feb 2009, 22:10:17 UTC


Uhh.. I think it's time again for this one..



All people around, press this button to feel better..! ;-D
I pressed it a few times.. many times.. hope it's not broken now.. ;-D

ID: 864795 · Report as offensive
john deneer
Volunteer tester
Joined: 16 Nov 06
Posts: 331
Credit: 20,996,606
RAC: 0
Netherlands
Message 865012 - Posted: 13 Feb 2009, 13:07:58 UTC - in response to Message 864795.  



I'm not going into panic mode yet, but I'm not getting any work either :-)

No multibeam, no astropulse, no cuda, absolutely nothing for the last 10 hours or so, even though the server status page claims that 50-60 thousand multibeam units are available. Apparently that doesn't mean much ...

Regards,
John.
ID: 865012 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 865021 - Posted: 13 Feb 2009, 13:33:56 UTC - in response to Message 865012.  



I went 24 hours with "No work is available" messages. Eventually tried the old "ipconfig /flushdns" and that sorted things immediately...

F.
ID: 865021 · Report as offensive
john deneer
Volunteer tester
Joined: 16 Nov 06
Posts: 331
Credit: 20,996,606
RAC: 0
Netherlands
Message 865032 - Posted: 13 Feb 2009, 14:24:01 UTC - in response to Message 865021.  



Hello Fred,

After reading your message I tried that one. It did something, all right. I received a grand total of 5 wu's over 4 machines (2 got some, 2 didn't). I guess it's not just asking the wrong server for work, there just isn't much work available as well. Well, I'll try again in a couple of hours. Thanks for the suggestion anyway!

Regards,
John.
ID: 865032 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 865035 - Posted: 13 Feb 2009, 14:30:53 UTC - in response to Message 865032.  


Glad it helped. Must admit that I did it just before leaving for work this morning and saw that it had triggered some downloads. I assumed that had fixed it sufficiently: the 10 WU's I saw it get, together with the Betas that were already in the cache, would keep the cores warm until I get home. I have noted that it hasn't contacted the server since I left it...

F.
ID: 865035 · Report as offensive
Profile [B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 865038 - Posted: 13 Feb 2009, 14:33:11 UTC - in response to Message 865021.  




Can someone explain how to do this 'ipconfig/flushdns' please.
ID: 865038 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 865039 - Posted: 13 Feb 2009, 14:36:03 UTC - in response to Message 865038.  
Last modified: 13 Feb 2009, 14:37:21 UTC



Open a "Command" window (Start > Run > cmd). Type ipconfig /flushdns into the box and press "Enter". Close the Command window.

F.
[edit] Note the <space> before the / character [/edit]
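
For anyone repeating this on several machines, the same step can be scripted. A minimal sketch in Python, assuming a Windows host; `flush_command` and `flush_dns` are hypothetical helper names, and only the `ipconfig /flushdns` command itself comes from the instructions above. Other operating systems use different commands, which the sketch does not try to guess.

```python
import platform
import subprocess


def flush_command(system):
    """Return the argument list that flushes the DNS resolver cache, or None.

    Only Windows is covered, because that is the only command given above.
    """
    if system == "Windows":
        # Separate list items play the role of the space before the "/".
        return ["ipconfig", "/flushdns"]
    return None


def flush_dns():
    """Run the flush on this machine; return True if the command succeeded."""
    command = flush_command(platform.system())
    if command is None:
        # Unknown platform: do nothing rather than guess at a command.
        return False
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0
```

Calling `flush_dns()` on a Windows machine runs the same command as typing it at the prompt; on any other platform it returns `False` without running anything.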
ID: 865039 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14674
Credit: 200,643,578
RAC: 874
United Kingdom
Message 865042 - Posted: 13 Feb 2009, 14:45:53 UTC

That explains the 'How', but not the 'Why'.

flushdns can only help if you're encountering an 'IP' (Internet Protocol) communications failure. It's been helpful in the past when the scheduler has allocated work, but you get IP errors when you try to download it.

This is different. You're communicating with the servers - they even sent you a message back. OK, it wasn't the message you wanted to see, but at the communication level, it worked.

So flushdns didn't actually help. It may have made you feel better, like banging your head against a brick wall can make you feel better, but it didn't actually cause Berkeley to create new work just for you. That was a coincidence.
ID: 865042 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 865067 - Posted: 13 Feb 2009, 16:10:45 UTC - in response to Message 865042.  


Richard,
This was *some* coincidence. As I said, I'd gone over 24 hours with "No work available" coming back to every work request (to Main - Beta downloads were fine). As soon as I did the flushdns Main downloaded WU's. Other things I had tried were bounce Boinc, re-boot machine, replace Boinc 6.6.3 with 6.6.4 (with 6.4.5 BM). It was my, perhaps naive, assumption that the flushdns command caused the DNS server (which may have been locked onto a particular one of the IP addresses associated with the URL) to be refreshed and, thus, pick up a working connection.

I recognise that I have not explained this very well; the complexities of IP routing, load-balancing servers etc. don't lend themselves to a short post, but I hope you get my drift.

Then again - perhaps you are right and it was just coincidence; but it worked for me and seems to have worked for john deneer??

F.
ID: 865067 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14674
Credit: 200,643,578
RAC: 874
United Kingdom
Message 865070 - Posted: 13 Feb 2009, 16:21:26 UTC - in response to Message 865067.  

The point you have to bear in mind there is that there is only one scheduling server (anakin), and you seem to have been talking to it: it told you there was no work available. No scope for a dns problem there.

There are indeed two download servers (bane and vader): once you have been allocated work, either of them will do for handling the download. If one is playing up, then letting dns look again in the hope of finding the other one is a good move.

But I still can't see any "cause and effect" for dns to influence a working scheduler communication.
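
The distinction between one scheduler and two download servers can be checked from the client side by counting how many addresses a name resolves to. A minimal Python sketch under stated assumptions: the post gives server names (anakin, bane, vader) but no hostnames, so `resolve_all` takes the hostname as a parameter, the helper names are hypothetical, and the addresses in the demonstration are reserved documentation addresses rather than real Berkeley ones.

```python
import socket


def unique_addresses(addrinfo):
    """Return the distinct IP addresses in a getaddrinfo() result, in order."""
    seen = []
    for family, socktype, proto, canonname, sockaddr in addrinfo:
        ip = sockaddr[0]  # first element is the address for IPv4 and IPv6
        if ip not in seen:
            seen.append(ip)
    return seen


def resolve_all(hostname, port=80):
    """Ask the system resolver for every address it knows for hostname."""
    info = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return unique_addresses(info)


def flush_could_matter(addresses):
    """A re-lookup can only reach a different server if there is more than one."""
    return len(addresses) > 1


# Illustration with made-up addresses from the 192.0.2.0/24 documentation range.
one_scheduler = ["192.0.2.10"]
two_download_servers = ["192.0.2.20", "192.0.2.21"]
print(flush_could_matter(one_scheduler))
print(flush_could_matter(two_download_servers))
```

The two prints give `False` and `True`: with a single address behind the name, a fresh lookup returns the same server, while two addresses leave room for a second lookup to land on the other one. Passing a real hostname to `resolve_all` shows which case applies.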
ID: 865070 · Report as offensive
john deneer
Volunteer tester
Joined: 16 Nov 06
Posts: 331
Credit: 20,996,606
RAC: 0
Netherlands
Message 865080 - Posted: 13 Feb 2009, 16:38:48 UTC - in response to Message 865070.  

But I still can't see any "cause and effect" for dns to influence a working scheduler communication.

Well, neither can I, but I don't know all details involved in Berkeley's setup. If your 'assumption' that only 1 server is involved is correct, then it would indeed be strange that clearing the dns cache would have any effect. But maybe they changed that, and there are now 2 or more addresses involved? I only know for sure that I don't know.

I tend to agree with Fred, because I had been trying to get work repeatedly, even by 'updating' manually through boincmanager, and got nothing. As soon as I /flushdns'ed (new English word :-) I got work on 2 of my machines. As Fred said: some coincidence!

However, I have to admit that those machines haven't gotten anything since. So that's one point in your favour. Let's call it a draw until maybe we get some more info :-)

Regards,
John.
ID: 865080 · Report as offensive
Hans Kramer
Volunteer tester

Joined: 16 May 99
Posts: 61
Credit: 8,770,184
RAC: 0
Netherlands
Message 865081 - Posted: 13 Feb 2009, 16:43:19 UTC
Last modified: 13 Feb 2009, 16:45:12 UTC

Tried the DNS flush, no-go for me. Still getting the no jobs line. Didn't really expect it to work, given what Richard said: DNS doesn't seem to (shouldn't) be the problem.

My guess is it's a server-side problem: the server not being able to keep up with the number of requests. BTW this has nothing to do with the work available or bandwidth; cricket seems fine.
ID: 865081 · Report as offensive
john deneer
Volunteer tester
Joined: 16 Nov 06
Posts: 331
Credit: 20,996,606
RAC: 0
Netherlands
Message 865083 - Posted: 13 Feb 2009, 16:49:15 UTC - in response to Message 865081.  


After my last message one of my machines spontaneously started downloading a couple of wu's .... must have been reading along as I typed :-)

Funny thing is that it first got served the 'no work' message and then got a couple of units up its harddisk. Twice in a row .... I've seen that happen before, but I guess it's consistent with your diagnosis of a server side problem. Inconsistent, unpredictable, stubborn. That server must be female :-)

Regards,
John.
ID: 865083 · Report as offensive
Hans Kramer
Volunteer tester

Joined: 16 May 99
Posts: 61
Credit: 8,770,184
RAC: 0
Netherlands
Message 865165 - Posted: 13 Feb 2009, 20:49:20 UTC - in response to Message 865083.  

Noticed something on the server page. It seems "Results ready to send" is varying between ~60k and ~120k. Seems a bit strange, you'd expect some variation but not this much. Don't know if this is a result or a reason or not related at all, just kinda odd.
ID: 865165 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 865170 - Posted: 13 Feb 2009, 21:33:05 UTC
Last modified: 13 Feb 2009, 21:55:44 UTC

Soon I will run out of work.. ~ 8 hours..

I pressed the update button ~ 10 times and got ~ 2 WUs..

I don't have time to do this all the time.. ;-D

Hmm.. soon it will idle and then I think I need to switch it OFF.. :-(


What is the problem?
And when will it be solved?


EDIT:
My current rig needs ~ 370 WUs/day to make the room warm.. ;-)
I don't run SETI@home for the credits.. it's for the science..
ID: 865170 · Report as offensive