No schedulars responded.

Message boards : Number crunching : No schedulars responded.

Pooh Bear 27
Volunteer tester
Joined: 14 Jul 03
Posts: 3224
Credit: 4,603,826
RAC: 0
United States
Message 125194 - Posted: 19 Jun 2005, 4:00:28 UTC

From the Technical News Page:
June 16, 2005 - 19:00 UTC
Since most (well over 99%) of scheduler accesses were now reaching the new scheduling server, we shut down the scheduler on the old server (which now only handles uploads/downloads).

I am suspecting these people that are having problems are maybe still hitting the old server, and have not updated to hit the new server.

I have had no problems, so far this weekend. Plenty of work to go around, it seems.



My movie https://vimeo.com/manage/videos/502242

Steve Cressman
Volunteer tester
Joined: 6 Jun 02
Posts: 583
Credit: 65,644
RAC: 0
Canada
Message 125223 - Posted: 19 Jun 2005, 4:46:07 UTC - in response to Message 124860.  
Last modified: 19 Jun 2005, 5:12:34 UTC

>I know about the planned outages, but this has gone on for me for 12 hours now, and others I have asked on some IRC channels are not having a problem. So why would I consistently be getting the dreaded 'No schedulers responded' message for the last 12 hrs? Web access is OK and the computer has been rebooted a few times today due to various updates.
>
>18/06/2005 9:09:52 PM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
>18/06/2005 9:09:52 PM|SETI@home|Requesting 0 seconds of work, returning 11 results
>18/06/2005 9:09:56 PM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
>18/06/2005 9:09:56 PM|SETI@home|No schedulers responded
>18/06/2005 9:09:57 PM|SETI@home|Deferring communication with project for 59 seconds
>
>What's interesting in all this is that every request has been deferred for 59 seconds and is not backing off.
>
>Live long and crunch.


I was having the same problem, deferring for 58 sec every time. After several hours of this I tried using a proxy to attach, had success, and work dl/ul. Turned off the proxy and the deferring for 58 sec problem returned. Initiated use of the proxy again and now it works again.

>I am suspecting these people that are having problems are maybe still hitting
>the old server, and have not updated to hit the new server.

This seems highly likely, and I was wondering what to do to get the DNS to update. (Methinks dumping the DNS cache in my firewall should do it.)

EDIT: Just removed the entries for Berkeley instead of deleting the whole cache and will see what happens with the next connection.
CONT: That didn't work but using proxy does solve it for now.
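
For anyone wanting to check the stale-DNS theory directly, one option is to list the addresses the local resolver returns for the scheduler hostname and compare them between a machine that works and one that does not. This is only a minimal Python sketch: the function name `resolved_addresses` and the injectable `resolver` argument are illustrative assumptions, not part of BOINC.

```python
import socket


def resolved_addresses(host, port=80, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses the resolver gives for host.

    `resolver` defaults to the system lookup; it is a parameter only so the
    function can be exercised without network access (an assumption of this
    sketch, not something BOINC provides).
    """
    infos = resolver(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolved_addresses("setiboinc.ssl.berkeley.edu")` on both machines and comparing the lists would show whether they are being pointed at different servers. If the lists match, a cached DNS record is probably not the cause.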

98SE XP2500+ @ 2.1 GHz Boinc v5.8.8

And God said "Let there be light." But then the program crashed because he was trying to access the 'light' property of a NULL universe pointer.

EclipseHA
Joined: 28 Jul 99
Posts: 1018
Credit: 530,719
RAC: 0
United States
Message 125240 - Posted: 19 Jun 2005, 5:24:27 UTC - in response to Message 125194.  

>From the Technical News Page:
>June 16, 2005 - 19:00 UTC
>Since most (well over 99%) of scheduler accesses were now reaching the new scheduling server, we shut down the scheduler on the old server (which now only handles uploads/downloads).



Let's see: 99% of 100000 means that 1000 people are still using the old server!

Keeping the old server active until it was 99.9 or even 99.999% might have been the correct move.

"hey, we get less than one scheduler hit an hour on the old server, ok to shut it down?" might have been a more valid gauge. The better test would have been only a handful of hits to the old server in a day....

Some ISPs set their DNS/BIND to always cache for X amount of time, regardless of what was specified in the original zone file... Ya got to give a DNS change like this about a week to fully ripple thru the internet... Trust me.. I've seen this many times before.....

Saying that "99%" is the mark just leads to frustration for, in this example, 1000 people. Rebooting their PC won't clear the problem if their DNS is back at their ISP (as would be the case for most users, or proxy sites).
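
The arithmetic behind those thresholds is easy to check. The sketch below uses the 100000 figure from the post, which is the poster's hypothetical and not a real user count; the function name `clients_left_behind` is made up for illustration. `Decimal` is used so that percentages such as 99.999 do not pick up floating-point noise.

```python
from decimal import Decimal


def clients_left_behind(total_clients, migrated_percent):
    """Clients still pointed at the old server, given the percent already migrated."""
    remaining_percent = Decimal(100) - Decimal(migrated_percent)
    return int(Decimal(total_clients) * remaining_percent / Decimal(100))


for pct in ("99", "99.9", "99.999"):
    # Prints 1000, 100 and 1 for the three thresholds mentioned above.
    print(pct, clients_left_behind(100000, pct))
```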

The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1904
Credit: 2,646,654
RAC: 0
Australia
Message 125293 - Posted: 19 Jun 2005, 9:25:28 UTC - in response to Message 125098.  
Last modified: 19 Jun 2005, 9:25:58 UTC

>I don't want to reset with 26 results ready to report.
>
>As said earlier....uploads work a treat. Just can't connect to a scheduler. Going out for a while, so I will check it out later and maybe downgrade my boinc client later and see if that works out.
>
>Paul.


Well it updated OK while I was out and downloaded 14 WUs, but it's back to "no schedulers responded" now with 5 results ready to report. It has now stopped the 58 second deferral and is doing exponential backoffs, with it now at a 1hr11min deferral.

Odd really.

Paul.
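
For readers unfamiliar with the term, exponential backoff means the wait grows by a constant factor after each consecutive failure, up to some ceiling. The sketch below is a generic illustration only: the 60-second base, the doubling factor, and the 4-hour cap are assumptions chosen for the example, and it is not the BOINC client's actual algorithm (which this thread does not describe).

```python
def backoff_delay(failures, base=60, factor=2, cap=4 * 3600):
    """Seconds to wait after `failures` consecutive failed requests.

    base, factor and cap are illustrative values, not BOINC's real settings.
    """
    if failures < 1:
        raise ValueError("failures must be at least 1")
    # Delay grows geometrically with each failure, then is clamped to the cap.
    return min(cap, base * factor ** (failures - 1))


for n in range(1, 10):
    print(n, backoff_delay(n))
```

A client stuck at a fixed 58 or 59 second deferral, as reported earlier in the thread, is not following a curve like this one.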

Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 125304 - Posted: 19 Jun 2005, 11:07:32 UTC - in response to Message 125240.  
Last modified: 19 Jun 2005, 11:09:32 UTC

>Let's see: 99% of 100000 means that 1000 people are still using the old server!
>
>Keeping the old server active until it was 99.9 or even 99.999% might have been the correct move.
>
>"hey, we get less than one scheduler hit an hour on the old server, ok to shut it down?" might have been a more valid gauge. The better test would have been only a handful of hits to the old server in a day....
>
>Some ISPs set their DNS/BIND to always cache for X amount of time, regardless of what was specified in the original zone file... Ya got to give a DNS change like this about a week to fully ripple thru the internet... Trust me.. I've seen this many times before.....
>
>Saying that "99%" is the mark just leads to frustration for, in this example, 1000 people. Rebooting their PC won't clear the problem if their DNS is back at their ISP (as would be the case for most users, or proxy sites).


If a project uses 2 scheduling servers, one of them hands out even result IDs while the other hands out odd result IDs. In a case like the present one, where one of the scheduling servers gets only 1% of the traffic, very few WUs can be validated, since for most WUs only 2 results have been sent out. This also means very little credit is handed out, the splitters must work much harder to generate enough WUs, and disk usage will increase very fast since no old WUs can be deleted.

BTW, if you try to connect to the old scheduling server, you'll get this message:
<scheduler_reply>
<message priority="low">Project is temporarily shut down for maintenance</message>
<request_delay>3600</request_delay>
<project_is_down/>
</scheduler_reply>
Content-type: text/plain
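
The reply above is small enough to pick apart with a standard XML parser. The sketch below reads the message, the requested delay, and the shutdown flag from the text quoted in the post; the function name `summarize_reply` is illustrative, and treating `request_delay` as a number of seconds is an assumption (3600 would then be one hour).

```python
import xml.etree.ElementTree as ET

# The reply body exactly as quoted in the post above.
REPLY = """<scheduler_reply>
<message priority="low">Project is temporarily shut down for maintenance</message>
<request_delay>3600</request_delay>
<project_is_down/>
</scheduler_reply>"""


def summarize_reply(xml_text):
    """Extract the fields of interest from a scheduler reply document."""
    root = ET.fromstring(xml_text)
    return {
        "message": root.findtext("message", default=""),
        # Assumed to be seconds; the thread does not state the unit.
        "request_delay": int(root.findtext("request_delay", default="0")),
        "project_is_down": root.find("project_is_down") is not None,
    }


print(summarize_reply(REPLY))
```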

Urs Echternacht
Volunteer tester
Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 125351 - Posted: 19 Jun 2005, 14:21:09 UTC - in response to Message 125223.  

[quote]
I was having the same problem, deferring for 58 sec every time. After several hours of this I tried using a proxy to attach, had success, and work dl/ul. Turned off the proxy and the deferring for 58 sec problem returned. Initiated use of the proxy again and now it works again.

...
EDIT: Just removed the entries for Berkeley instead of deleting the whole cache and will see what happens with the next connection.
CONT: That didn't work but using proxy does solve it for now.
[/quote]

Hi Steve,

Which proxy did you use? I tried several around the world, but they did not change my situation. Still no response from the scheduler. The last two WUs of my 6-day cache are running now and the deadline is eight days away, so there is a chance to figure this out in time. 39 WUs to report.

_\|/_
U r s

Thierry Van Driessche
Volunteer tester
Joined: 20 Aug 02
Posts: 3083
Credit: 150,096
RAC: 0
Belgium
Message 125353 - Posted: 19 Jun 2005, 14:27:12 UTC - in response to Message 125351.  


>Which proxy did you use? I tried several around the world, but they did not change my situation. Still no response from the scheduler. The last two WUs of my 6-day cache are running now and the deadline is eight days away, so there is a chance to figure this out in time. 39 WUs to report.

A list of proxies that worked some 2 months ago can be found here.

Urs Echternacht
Volunteer tester
Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 125355 - Posted: 19 Jun 2005, 14:37:25 UTC - in response to Message 125353.  
Last modified: 19 Jun 2005, 14:52:56 UTC


>>Which proxy did you use? I tried several around the world, but they did not change my situation. Still no response from the scheduler. The last two WUs of my 6-day cache are running now and the deadline is eight days away, so there is a chance to figure this out in time. 39 WUs to report.
>
>A list of proxies that worked some 2 months ago can be found here.


Thanks Thierry.

This is the list I tried today, along with two other sites' lists of free proxies. Maybe a proxy will not solve this. My hope is that the machine will run out of work and then be able to connect. Approximately two hours till then.

Edit: Earlier I forgot that two machines are unable to report/download WUs, so it's 59 WUs to report. My third machine was just able to connect via the same dial-up connection, so DNS does not seem to be the problem.


_\|/_
U r s

Urs Echternacht
Volunteer tester
Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 125369 - Posted: 19 Jun 2005, 15:30:12 UTC

Something positive:

When the last WU was halfway through, the connection succeeded. Now it is downloading new work. I reduced the cache size to 4 days. Hopefully this works better.


_\|/_
U r s

The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1904
Credit: 2,646,654
RAC: 0
Australia
Message 125486 - Posted: 19 Jun 2005, 23:58:01 UTC - in response to Message 125293.  

>>I don't want to reset with 26 results ready to report.
>>
>>As said earlier....uploads work a treat. Just can't connect to a scheduler. Going out for a while, so I will check it out later and maybe downgrade my boinc client later and see if that works out.
>>
>>Paul.
>
>
>Well it updated OK while I was out and downloaded 14 WUs, but it's back to "no schedulers responded" now with 5 results ready to report. It has now stopped the 58 second deferral and is doing exponential backoffs, with it now at a 1hr11min deferral.
>
>Odd really.
>
>Paul.


After hitting update a few times, BOINC went into the 58 second deferral and stayed that way for another 10 hrs, after which it was able to report and get new work again. Is there something that Berkeley have implemented that they haven't told us yet? I've never had this sort of behaviour from BOINC before.

Interestinger and interestinger....

Paul
(S@H1 8888)
And proud of it!

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 125527 - Posted: 20 Jun 2005, 3:00:02 UTC

The addresses of the scheduler(s) are stored in the master file. The master file is checked after 10 contact failures. It should be self-correcting, and if you don't have the patience, you can just hit update enough times.
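
The rule described above can be pictured as a simple failure counter. This is a hedged sketch of that idea, not BOINC source code: the class name `SchedulerContact`, the reset on success, and the reset after triggering a refetch are all assumptions; only the threshold of 10 comes from the post.

```python
class SchedulerContact:
    """Tracks consecutive scheduler failures and flags a master-file refetch."""

    def __init__(self, refetch_after=10):
        # 10 is the figure given in the post; everything else here is assumed.
        self.refetch_after = refetch_after
        self.consecutive_failures = 0

    def record(self, succeeded):
        """Record one request outcome; return True if the master file should be refetched."""
        if succeeded:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.refetch_after:
            # Start counting afresh once a refetch has been requested.
            self.consecutive_failures = 0
            return True
        return False


contact = SchedulerContact()
outcomes = [contact.record(False) for _ in range(10)]
print(outcomes)  # nine False values followed by one True
```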


BOINC WIKI

The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1904
Credit: 2,646,654
RAC: 0
Australia
Message 125530 - Posted: 20 Jun 2005, 3:17:32 UTC - in response to Message 125527.  
Last modified: 20 Jun 2005, 3:18:31 UTC

>The addresses of the scheduler(s) are stored in the master file. The master file is checked after 10 contact failures. It should be self-correcting, and if you don't have the patience, you can just hit update enough times.


Let me see....10hrs of

20/06/2005 9:20:42 AM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
20/06/2005 9:20:42 AM|SETI@home|Requesting 0 seconds of work, returning 6 results
20/06/2005 9:20:45 AM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
20/06/2005 9:20:45 AM|SETI@home|No schedulers responded
20/06/2005 9:20:46 AM|SETI@home|Deferring communication with project for 58 seconds

equals just a few more than 10 contact failures.....

As it is said, "Patience is a virtue". I am trying to be very virtuous when it comes to BOINC atm.

What's interesting in all this is that BOINC will eventually make contact, report work and get new work then start repeating the above for another 10hrs.

Paul.
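
To put a rough number on "a few more than 10", the log excerpt in this post can be parsed and the failure count over 10 hours estimated. The sketch assumes the message-log layout shown above (timestamp, project, and message separated by `|`) and a cycle of about 62 seconds per attempt (roughly 4 seconds from request to deferral notice plus the 58-second wait). Both the cycle length and the function names are assumptions made for illustration.

```python
LOG = """\
20/06/2005 9:20:42 AM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
20/06/2005 9:20:42 AM|SETI@home|Requesting 0 seconds of work, returning 6 results
20/06/2005 9:20:45 AM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
20/06/2005 9:20:45 AM|SETI@home|No schedulers responded
20/06/2005 9:20:46 AM|SETI@home|Deferring communication with project for 58 seconds
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message), or None if malformed."""
    parts = line.strip().split("|", 2)
    return tuple(parts) if len(parts) == 3 else None


def count_failures(log_text):
    """Count 'No schedulers responded' entries in a block of log text."""
    count = 0
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed and parsed[2] == "No schedulers responded":
            count += 1
    return count


def estimated_failures(hours, cycle_seconds):
    """Whole number of request cycles that fit into the given number of hours."""
    return (hours * 3600) // cycle_seconds


print(count_failures(LOG))         # 1 failure in the excerpt
print(estimated_failures(10, 62))  # about 580 failures over 10 hours
```

On these assumptions the client would have logged several hundred consecutive failures, far more than the 10 needed to trigger a master file check.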

The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1904
Credit: 2,646,654
RAC: 0
Australia
Message 125620 - Posted: 20 Jun 2005, 10:53:08 UTC

Still at it. Other projects are working OK, so it's not my ISP, it's not my connection, it has to be something at Berkeley...anyone?

Paul.

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 125623 - Posted: 20 Jun 2005, 11:03:44 UTC - in response to Message 125620.  

>Still at it. Other projects are working OK, so it's not my ISP, it's not my connection, it has to be something at Berkeley...anyone?

No problem here- 3 seconds from request to response from the Scheduler at the moment.
Grant
Darwin NT

The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1904
Credit: 2,646,654
RAC: 0
Australia
Message 126401 - Posted: 22 Jun 2005, 12:51:52 UTC - in response to Message 125620.  
Last modified: 22 Jun 2005, 12:54:37 UTC

>Still at it. Other projects are working OK, so it's not my ISP, it's not my connection, it has to be something at Berkeley...anyone?
>
>Paul.

Still at it..about to run out of work. I am not having this problem on my laptop which also uses the same connection from home to get work on occasion.

After hitting update many times I have seen a new message....

22/06/2005 10:48:56 PM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
22/06/2005 10:48:56 PM|SETI@home|Requesting 518400 seconds of work, returning 13 results
22/06/2005 10:49:00 PM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
22/06/2005 10:49:00 PM|SETI@home|No schedulers responded
22/06/2005 10:49:01 PM|SETI@home|Deferring communication with project for 59 seconds
22/06/2005 10:50:01 PM|SETI@home|Fetching master file
22/06/2005 10:50:05 PM|SETI@home|Master page download succeeded
22/06/2005 10:50:06 PM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
22/06/2005 10:50:06 PM|SETI@home|Requesting 518400 seconds of work, returning 13 results
22/06/2005 10:50:10 PM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
22/06/2005 10:50:10 PM|SETI@home|No schedulers responded
22/06/2005 10:50:11 PM|SETI@home|Deferring communication with project for 59 seconds

So a new master file has been retrieved, which must have the required URL in it.

I am now out of seti work :(

Why is it so....

Paul.

Alite
Joined: 1 Sep 99
Posts: 3
Credit: 2,807,174
RAC: 0
Netherlands
Message 126408 - Posted: 22 Jun 2005, 13:22:52 UTC

On most of my machines I can contact the Scheduler but don't receive any new work. Just these messages...

6/22/2005 7:59:41 AM|SETI@home|Sending request to scheduler: http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
6/22/2005 7:59:42 AM|SETI@home|Scheduler RPC to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded

Any ideas?
Arie

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 126566 - Posted: 23 Jun 2005, 0:18:35 UTC

I've joined you in the deferral of communication, Paul. Even a project reset doesn't help. Only thing different so far is that I only had 4 of the 58-second deferrals before I jumped to 38 minutes. ;)

23/06/2005 02:13:58|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
23/06/2005 02:13:58|SETI@home|Requesting 43200 seconds of work, returning 0 results
23/06/2005 02:13:59|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
23/06/2005 02:13:59|SETI@home|No schedulers responded
23/06/2005 02:14:00|SETI@home|Deferring communication with project for 38 minutes and 20 seconds

About to put the thing to NNW again and just leave it be. Beta can then do the work. :)

mikey
Volunteer tester
Joined: 17 Dec 99
Posts: 4215
Credit: 3,474,603
RAC: 0
United States
Message 126570 - Posted: 23 Jun 2005, 0:25:24 UTC - in response to Message 126566.  

>I've joined you in the deferral of communication, Paul. Even a project reset doesn't help. Only thing different so far is that I only had 4 of the 58-second deferrals before I jumped to 38 minutes. ;)
>
>23/06/2005 02:13:58|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
>23/06/2005 02:13:58|SETI@home|Requesting 43200 seconds of work, returning 0 results
>23/06/2005 02:13:59|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
>23/06/2005 02:13:59|SETI@home|No schedulers responded
>23/06/2005 02:14:00|SETI@home|Deferring communication with project for 38 minutes and 20 seconds
>
>About to put the thing to NNW again and just leave it be. Beta can then do the work. :)

If you look at the Cogent connection:
http://fragment1.berkeley.edu/~cricket/inr-668-interfaces.html
it shows Berkeley is busy doing something!


Daniel Michel
Volunteer tester
Joined: 2 Feb 04
Posts: 14925
Credit: 1,378,607
RAC: 6
United States
Message 126572 - Posted: 23 Jun 2005, 0:29:22 UTC

I think the traffic jam is being caused by the new SETI 4.18 application being downloaded to SETI users all over the planet...both of my machines have now received it...but it took some time.

PROUD TO BE TFFE!

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 126582 - Posted: 23 Jun 2005, 1:10:09 UTC

Yup, it was the traffic jam that caused it for me. ;)

I've now got 4.18, one unit and hot damn, I got graphics. :)