Unable to contact scheduler but web page shows everything up

Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135145 - Posted: 10 Jul 2005, 15:22:31 UTC

I have been very frustrated with BOINC over the last few weeks, and I think I have realized why.

It seemed reasonable to me that once I had uploaded my results to the server, I should see them on my home page at SETI. Unfortunately this is NOT TRUE. Two things have to happen:

1) The results have to upload
2) My computer has to contact a scheduler to somehow tell the servers that my results are there

As a result, I thought that I had somehow mis-installed the software, and I just kept uninstalling and reinstalling it.

This is a result of the scheduler being SO, SO difficult to contact. I would look at the server stats page and everything would seem just fine, no backlogs, etc. Unfortunately, I still could not contact a scheduler.

I now believe that we are missing a critical piece of info on the server stats page and that it does not accurately reflect the status of the scheduler. WE NEED SOMETHING ON THAT PAGE WHICH INDICATES WHETHER THE SCHEDULER IS DIFFICULT TO CONTACT.

I believe that this would significantly reduce people's reported dissatisfaction with BOINC.

Also, I believe that the BOINC/SETI staff's reluctance to comment on these issues is compounding the problem. When we were running old SETI we would get regular messages about bad performance, but now all we get is a deafening silence. This only leads people to believe that the software is terrible and broken when, in fact, what we are seeing is poor performance.

Unless we get either better performance or more communication from BOINC/SETI staff, BOINC's reputation will continue to suffer until we see an article in the New York Times: "Mass Defections from the World's First Grid Project." That truly would be an unfortunate end for such an important project.

Come on guys, give us some feedback.

ID: 135145 · Report as offensive
Profile Saenger
Volunteer tester
Avatar

Send message
Joined: 3 Apr 99
Posts: 2452
Credit: 33,281
RAC: 0
Germany
Message 135164 - Posted: 10 Jul 2005, 15:58:00 UTC
Last modified: 10 Jul 2005, 15:58:18 UTC

Hello and welcome,

the regular messages are on the technical news page and usually on the home page as well.

If you want more information on the actual status of the different projects, you can use this excellent site.
Greetings from Saenger

For questions about BOINC, look in the BOINC-Wiki
ID: 135164 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135166 - Posted: 10 Jul 2005, 16:03:03 UTC - in response to Message 135164.  

I just checked, and the site says that everything is just dandy. Something is wrong, because I still can't
contact the scheduler. Judging from the number of complaints I see, I don't believe that I'm the only one.
Somewhere, somehow, something is either misleading or broken. How can we solve this problem if we don't hear from the site managers?


ID: 135166 · Report as offensive
Profile Saenger
Volunteer tester
Avatar

Send message
Joined: 3 Apr 99
Posts: 2452
Credit: 33,281
RAC: 0
Germany
Message 135168 - Posted: 10 Jul 2005, 16:05:49 UTC - in response to Message 135166.  

I just tried a manual update, and it succeeded immediately:

10.07.2005 18:06:56|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
10.07.2005 18:06:56|SETI@home|Requesting 0 seconds of work, returning 0 results
10.07.2005 18:06:58|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded



Can you post your messages so that we can help further?
ID: 135168 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135174 - Posted: 10 Jul 2005, 16:11:38 UTC - in response to Message 135168.  


2005-07-10 11:55:59 [SETI@home] No schedulers responded
2005-07-10 12:04:54 [---] Received signal 15
2005-07-10 12:05:01 [---] Starting BOINC client version 4.43 for powerpc-apple-darwin
2005-07-10 12:05:01 [---] Data directory: /Users/odell/Library/Application Support/Boinc Data
2005-07-10 12:05:01 [SETI@home] Computer ID: 1127764; location: home; project prefs: home
2005-07-10 12:05:01 [---] General prefs: from SETI@home (last modified 2005-06-22 07:58:51)
2005-07-10 12:05:01 [---] General prefs: using separate prefs for home
2005-07-10 12:05:01 [SETI@home] Resuming computation for result 29ap04aa.23242.22242.779842.225_0 using setiathome version 4.18
2005-07-10 12:05:01 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-07-10 12:05:01 [---] schedule_cpus: must schedule
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] New CPU scheduler policy: earliest deadline first.
2005-07-10 12:05:01 [---] earliest deadline: 1121463390.000000 29ap04aa.23242.22242.779842.225_0
2005-07-10 12:05:09 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-10 12:05:09 [SETI@home] No schedulers responded
2005-07-10 12:05:09 [SETI@home] Deferring communication with project for 1 hours, 1 minutes, and 18 seconds

I've been getting messages like these regularly. Sometimes I can connect, but mostly I fail. Are you suggesting that I am the only one experiencing this problem? This list seems rife with such problems.

ID: 135174 · Report as offensive
Profile Thierry Van Driessche
Volunteer tester
Avatar

Send message
Joined: 20 Aug 02
Posts: 3083
Credit: 150,096
RAC: 0
Belgium
Message 135180 - Posted: 10 Jul 2005, 16:28:23 UTC - in response to Message 135174.  
Last modified: 10 Jul 2005, 16:31:04 UTC

2005-07-10 12:05:01 [---] General prefs: using separate prefs for home
2005-07-10 12:05:01 [SETI@home] Resuming computation for result 29ap04aa.23242.22242.779842.225_0 using setiathome version 4.18
2005-07-10 12:05:01 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-07-10 12:05:01 [---] schedule_cpus: must schedule

2005-07-10 12:05:09 [SETI@home] No schedulers responded
2005-07-10 12:05:09 [SETI@home] Deferring communication with project for 1 hours, 1 minutes, and 18 seconds

What you could do is the following. Go to the command prompt and type:

tracert setiboinc.ssl.berkeley.edu

The output should end with something similar to the following:

7 21 ms 13 ms 13 ms p14-0.core01.lon02.atlas.cogentco.com [130.117.1.101]
8 102 ms 90 ms 88 ms p10-0.core01.bos01.atlas.cogentco.com [130.117.0.46]
9 120 ms 111 ms 109 ms p5-0.core01.ord01.atlas.cogentco.com [66.28.4.110]
10 178 ms 176 ms 178 ms p5-0.core01.sfo01.atlas.cogentco.com [66.28.4.185]
11 177 ms 166 ms 167 ms UC-Berkeley.demarc.cogentco.com [66.250.4.74]
12 173 ms 169 ms 177 ms 66.28.250.124

If this is not the case, then you have a connection problem to Berkeley. This can be a modem problem, an ISP problem, a connector problem somewhere between your host and the ISP, or a wiring problem.
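Note that tracert is the Windows name of the tool; on macOS and Linux the equivalent command is traceroute. A minimal Python sketch that picks the right command for the current platform follows. The function names build_trace_command and run_trace are hypothetical and chosen only for illustration.

```python
import platform
import subprocess

def build_trace_command(host, system=None):
    """Return the route-tracing command for an OS as an argument list.

    `system` defaults to the current platform name as reported by
    platform.system() ("Windows", "Darwin", "Linux", ...).
    """
    system = system or platform.system()
    if system == "Windows":
        return ["tracert", host]
    # macOS, Linux and other Unix-like systems ship the tool as "traceroute"
    return ["traceroute", host]

def run_trace(host):
    """Run the trace and let its output go straight to the terminal."""
    subprocess.run(build_trace_command(host), check=False)
```

Calling run_trace("setiboinc.ssl.berkeley.edu") should then print the hop list on either kind of system, provided the tool is installed.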
Read the News at the homepage
Have a look at the Technical News
Check whether your question is answered in the BOINC Wiki

Best greetings, Thierry
ID: 135180 · Report as offensive
Profile Saenger
Volunteer tester
Avatar

Send message
Joined: 3 Apr 99
Posts: 2452
Credit: 33,281
RAC: 0
Germany
Message 135183 - Posted: 10 Jul 2005, 16:30:21 UTC - in response to Message 135174.  
Last modified: 10 Jul 2005, 16:51:45 UTC

I've been getting messages like these regularly. Sometimes I can connect, but mostly I fail. Are you suggesting that I am the only one experiencing this problem? This list seems rife with such problems.


No, I'm not. I'm just suggesting that there may be issues involved other than a broken scheduler.
- Proxy and firewall are probably not the problem for you, as you already have some WUs. Or have you changed anything in this regard?
- The ISP may be; it has been before, but I can't say anything yet. Have you tried a trace to the scheduler?
- Bad luck could be a factor as well, but I doubt that it would strike often enough to be noticeable :(

BTW: here's my trace output:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:/Dokumente und Einstellungen/Uwe>tracert setiboinc.ssl.berkeley.edu

Tracing route to setiboinc.ssl.berkeley.edu [66.28.250.124] over a maximum of 30 hops:

1 55 ms 55 ms 55 ms 213-182-110-254.teleos-web.de [213.182.110.254]

2 * * * Request timed out.
3 86 ms 57 ms 54 ms m10-hf.dts-online.net [212.62.64.30]
4 60 ms 57 ms 57 ms DTS.DO-2-pos130.de.lambdanet.net [217.71.111.29]
5 59 ms 58 ms 60 ms DUS-2-pos210.de.lambdanet.net [217.71.105.57]
6 62 ms 63 ms 63 ms FRA-8-pos000.de.lambdanet.net [217.71.105.46]
7 62 ms 64 ms 62 ms g0-1-243.core01.fra03.atlas.cogentco.com [130.117.0.109]
8 67 ms 65 ms 66 ms p12-0.core01.ams03.atlas.cogentco.com [130.117.0.86]
9 72 ms 74 ms 73 ms p5-0.core01.lon02.atlas.cogentco.com [130.117.1.225]
10 151 ms 150 ms 150 ms p10-0.core01.bos01.atlas.cogentco.com [130.117.0.46]
11 172 ms 172 ms 171 ms p5-0.core01.ord01.atlas.cogentco.com [66.28.4.110]
12 217 ms 216 ms 217 ms p5-0.core01.sfo01.atlas.cogentco.com [66.28.4.185]
13 220 ms 218 ms 219 ms UC-Berkeley.demarc.cogentco.com [66.250.4.74]
14 218 ms 221 ms 219 ms 66.28.250.124

Trace complete.


AFAIK something like hop 2 can lead to rejected or incomplete transfers, but not to a 'no response'. I may be wrong, though.
ID: 135183 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135221 - Posted: 10 Jul 2005, 18:35:19 UTC - in response to Message 135183.  

Here is my traceroute information:

Traceroute has started ...

traceroute to setiboinc.ssl.berkeley.edu (66.28.250.124), 64 hops max, 40 byte packets
1 66.102.164.7 (66.102.164.7) 893.696 ms 741.072 ms 799.223 ms
2 66.102.164.3 (66.102.164.3) 728.195 ms 1052.500 ms 737.483 ms
3 66.102.164.33 (66.102.164.33) 767.430 ms 915.520 ms 755.489 ms
4 66.102.163.137 (66.102.163.137) 805.519 ms 548.411 ms 919.615 ms
5 66.102.163.134 (66.102.163.134) 894.530 ms 803.767 ms 875.124 ms
6 66.209.15.129 (66.209.15.129) 575.872 ms 862.352 ms 839.315 ms
7 66.10.13.1 (66.10.13.1) 1016.126 ms 569.340 ms 812.330 ms
8 bb2-g8-0-0.atlnga.sbcglobal.net (66.10.12.242) 831.589 ms 565.963 ms 803.569 ms
9 ex1-p11-2.pxatga.sbcglobal.net (151.164.42.15) 819.979 ms 893.166 ms 813.620 ms
10 151.164.248.182 (151.164.248.182) 1558.032 ms 824.823 ms 787.973 ms
11 p14-0.core01.mco01.atlas.cogentco.com (66.28.4.153) 799.651 ms 1168.278 ms 808.012 ms
12 p14-0.core01.tpa01.atlas.cogentco.com (66.28.4.142) 1163.501 ms 813.614 ms 538.667 ms
13 p5-0.core01.iah01.atlas.cogentco.com (66.28.4.45) 810.063 ms 824.560 ms 782.173 ms
14 p10-0.core01.sjc01.atlas.cogentco.com (66.28.4.238) 829.133 ms 621.404 ms 829.113 ms
15 p4-0.core01.sfo01.atlas.cogentco.com (66.28.4.93) 841.119 ms 1429.698 ms 832.816 ms
16 uc-berkeley.demarc.cogentco.com (66.250.4.74) 1048.960 ms 1081.700 ms 998.110 ms
17 66.28.250.124 (66.28.250.124) 818.155 ms 848.781 ms 853.295 ms


Is it possible that the software on my PC (Mac) is giving up too quickly? It seems to time out with a "no response" rather quickly.
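One thing the trace above does show is that every hop, including the very first one, takes several hundred milliseconds, which may point to latency on the local link rather than at Berkeley. A rough Python sketch for averaging the round-trip times per hop from pasted traceroute or tracert output; hop_averages is a hypothetical helper, and it assumes each hop line starts with the hop number and lists its timings as "N ms":

```python
import re

# A timing is a number (optionally with decimals) directly followed by " ms".
RTT = re.compile(r"(\d+(?:\.\d+)?) ms")

def hop_averages(trace_text):
    """Return (hop_number, mean round-trip time in ms) for each hop with timings.

    Header lines and hops that only show "* * *" are skipped.
    """
    result = []
    for line in trace_text.splitlines():
        parts = line.split()
        if not parts or not parts[0].isdigit():
            continue  # not a hop line
        times = [float(t) for t in RTT.findall(line)]
        if times:
            result.append((int(parts[0]), sum(times) / len(times)))
    return result
```

If the first hop already averages close to the figure for the last hop, most of the delay is being added before the traffic leaves the local network or ISP.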


ID: 135221 · Report as offensive
Profile Saenger
Volunteer tester
Avatar

Send message
Joined: 3 Apr 99
Posts: 2452
Credit: 33,281
RAC: 0
Germany
Message 135223 - Posted: 10 Jul 2005, 18:41:27 UTC - in response to Message 135221.  


That's possible, but as I'm on windoze, not on a proper puter ;), I can't tell you. Perhaps you should post it as well over in the Mac part of this forum or in number crunching.

Sorry for not being able to help :(
ID: 135223 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135225 - Posted: 10 Jul 2005, 18:46:24 UTC - in response to Message 135223.  



I still say we need some help from the implementors. Whatever the problem is, I think a lot more people than just me have it, and it's giving the project a big, big black mark.

ID: 135225 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135253 - Posted: 10 Jul 2005, 19:51:02 UTC - in response to Message 135225.  


To follow up, I downloaded the code for BOINC and have been looking at the HTTP polling code. It looks to me as if, when the reply from the server has not arrived by the time the code gets to the polling routine, the code fails with a "No schedulers responded" message. It seems to me that the polling code should just return and wait for the data to show up. It's a question of the poll method returning true rather than false.

If I'm correct, that would explain the behavior we're all seeing: people with low-latency connections would not see a problem, and people with higher-latency connections would be frustrated.
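To make the idea concrete, here is a minimal Python sketch of a poll routine that treats "no data yet" as "still pending" and only gives up once a deadline has passed. This is not the BOINC client's code; poll_once and wait_for_reply are hypothetical names, and the timeout values are arbitrary.

```python
import select
import socket
import time

def poll_once(sock):
    """Single non-blocking check: 'ready' if reply bytes are waiting, else 'pending'."""
    # A zero timeout makes select() return immediately instead of blocking.
    readable, _, _ = select.select([sock], [], [], 0)
    return "ready" if readable else "pending"

def wait_for_reply(sock, timeout_s, interval_s=0.05):
    """Keep polling until data arrives or the deadline passes.

    Returns the received bytes, or None if nothing arrived in time.
    An empty poll is not treated as a failure; only the deadline is.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if poll_once(sock) == "ready":
            return sock.recv(4096)
        time.sleep(interval_s)
    return None
```

With this shape, a slow link only fails if the reply takes longer than timeout_s, not merely because the reply had not arrived at the first poll.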
ID: 135253 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135283 - Posted: 10 Jul 2005, 20:39:20 UTC - in response to Message 135253.  


I sent a note to the BOINC developers list asking if someone could look into this problem in the code.


ID: 135283 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135383 - Posted: 11 Jul 2005, 2:06:59 UTC - in response to Message 135283.  


I have looked into this further and found that I could turn on some debugging output locally on my machine.
Here is some output that may be useful. Note the line below that says "Internal Server Error". How can we debug this further?


2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req: http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler

2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:

2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:

2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:

2005-07-10 21:50:52 [DEBUG_HTTP ] HTTP_OP::init_post(): 0x89008 io_done 0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): wrote HTTP header to socket 7: 245 bytes
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST /sah_cgi/cgi HTTP/1.0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: User-Agent: BOINC client (powerpc-apple-darwin 4.43)
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: setiboinc.ssl.berkeley.edu:80
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 6597
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.1 500 Internal Server Error
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Mon, 11 Jul 2005 01:50:59 GMT
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/1.3.31 (Unix) mod_fastcgi/2.4.0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Connection: close
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/html; charset=iso-8859-1
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=500
2005-07-10 21:51:05 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-10 21:51:05 [SETI@home] No schedulers responded
2005-07-10 21:51:06 [SETI@home] Deferring communication with project for 13 minutes and 31 seconds
2005-07-10 21:51:06 [DEBUG_SCHED_OP ] SCHEDULER_OP::poll(): return to idle state
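
To tell a genuine "no response" apart from an error reply, the status can be pulled out of debug output like the above. A rough Python sketch, assuming the log format shown here (the "status=NNN" line written by HTTP_REPLY_HEADER::parse()); the function names are hypothetical:

```python
import re

# Matches the debug line written after a reply header has been parsed,
# e.g. "... HTTP_REPLY_HEADER::parse(): status=500" (format taken from the log above).
STATUS_LINE = re.compile(r"HTTP_REPLY_HEADER::parse\(\): status=(\d{3})")

def reply_statuses(log_text):
    """Return every HTTP status code found in the debug log, in order."""
    return [int(m.group(1)) for m in STATUS_LINE.finditer(log_text)]

def summarize(log_text):
    """Say whether the server answered at all, and with what status."""
    statuses = reply_statuses(log_text)
    if not statuses:
        return "no reply header logged: the server may really not have responded"
    return "server replied with HTTP status " + ", ".join(str(s) for s in statuses)
```

For the log above this reports a 500 reply, so the server was reached and answered, even though the client's summary message reads "No schedulers responded".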


ID: 135383 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135389 - Posted: 11 Jul 2005, 2:13:37 UTC - in response to Message 135383.  


Looking up the 500 error code gives the following (the text is from the HTTP/1.1 specification, RFC 2616, section 10.5):

10.5 Server Error 5xx

Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. User agents SHOULD display any included entity to the user. These response codes are applicable to any request method.

10.5.1 500 Internal Server Error

The server encountered an unexpected condition which prevented it from fulfilling the request.

Does this mean that the server code has a bug, or that the messages being sent by the client are incorrect under some circumstances?
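
Going by the spec text, the status class at least hints at which side is at fault: 4xx codes are defined for client errors and 5xx codes for server errors. A small Python sketch using the standard library's http.HTTPStatus; responsible_side and describe are hypothetical names. It cannot rule out that an unusual client request is what triggers the server-side fault.

```python
from http import HTTPStatus

def responsible_side(status_code):
    """Map an HTTP status code to the side its class points at."""
    if 400 <= status_code <= 499:
        return "client"  # 4xx: the request itself is considered faulty
    if 500 <= status_code <= 599:
        return "server"  # 5xx: the server erred or cannot perform the request
    return "none"        # 1xx/2xx/3xx are not error classes

def describe(status_code):
    """Return e.g. '500 Internal Server Error (server side)'."""
    phrase = HTTPStatus(status_code).phrase
    return f"{status_code} {phrase} ({responsible_side(status_code)} side)"
```

So a 500 from the scheduler says the server hit an unexpected condition while handling the request; had the request been rejected as malformed, a 4xx code such as 400 would be the expected answer.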


ID: 135389 · Report as offensive
Jim O'Dell

Send message
Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135394 - Posted: 11 Jul 2005, 2:23:21 UTC - in response to Message 135383.  


Is is possible that the software on my PC (Mac) is giving up to quickly? It seems to timeout with a no response rather quickly.



That's possible, but as I'm on windoze, not on a proper puter ;), I can't tell you. Perhaps you should post it as well over in the Mac part of this forum or in number crunching.

Sorry for not being able to help :(



I still say we need some help from the implementors. Whatever the problem is, I think a lot more people than me have this problem and its giving the project a big, big, black mark.


To follow-up, I downloaded the code for Boinc and have been looking at the HTTP polling code. It looks to me that if the result from the server has not arrived by the time the code gets to the polling routine that the code will fail with a "No schedulers responded" message. Looks to me that the polling code should just return and wait for the data to show up. Its a question of the poll method returning true rather than false.

If I'm correct that would explain the behavior we're all seeing - people with low latency connections would not see a problem and people with higher latency connections would be frustrated.



I sent a note to the BOINC developers list asking if somone could look into this problem in the code.




I have looked into this further and found that I could turn on some debugging info local to my machine.
Here is some output that may be useful. Note the line below that says "Internal Server Error" How can we debug this further?


2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req: http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler

2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:

2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:

2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:

2005-07-10 21:50:52 [DEBUG_HTTP ] HTTP_OP::init_post(): 0x89008 io_done 0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): wrote HTTP header to socket 7: 245 bytes
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST /sah_cgi/cgi HTTP/1.0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: User-Agent: BOINC client (powerpc-apple-darwin 4.43)
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: setiboinc.ssl.berkeley.edu:80
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 6597
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.1 500 Internal Server Error
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Mon, 11 Jul 2005 01:50:59 GMT
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/1.3.31 (Unix) mod_fastcgi/2.4.0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Connection: close
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/html; charset=iso-8859-1
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=500
2005-07-10 21:51:05 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-10 21:51:05 [SETI@home] No schedulers responded
2005-07-10 21:51:06 [SETI@home] Deferring communication with project for 13 minutes and 31 seconds
2005-07-10 21:51:06 [DEBUG_SCHED_OP ] SCHEDULER_OP::poll(): return to idle state
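One thing the log above already shows is that the reply headers were read, so the client did receive an answer rather than giving up early. The sketch below measures how long that answer took. It assumes the timestamp and message layout seen in this log; `summarize` and `LINE_RE` are names made up for this example, not part of BOINC.

```python
import re
from datetime import datetime

# Matches lines such as: "2005-07-10 21:50:53 [DEBUG_HTTP ] message text"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[[^\]]*\] (.*)$")


def summarize(log_text):
    """Return (seconds from request sent to status parsed, HTTP status), or None."""
    sent = None
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        message = m.group(2)
        if "finished sending request body" in message:
            sent = when
        status = re.search(r"status=(\d{3})", message)
        if status and sent is not None:
            return (when - sent).total_seconds(), int(status.group(1))
    return None


sample = """\
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=500
"""
print(summarize(sample))  # (12.0, 500)
```

A 12-second wait followed by a parsed status of 500 suggests the failure is reported by the server side rather than caused by the client timing out, although the log alone cannot prove that.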



Looking up the 500 error code that Apache returned gives (from the HTTP/1.1 specification, RFC 2616):

10.5 Server Error 5xx

Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. User agents SHOULD display any included entity to the user. These response codes are applicable to any request method.

10.5.1 500 Internal Server Error

The server encountered an unexpected condition which prevented it from fulfilling the request.

Does this mean that the server code has a bug, or that the messages being sent by the client are incorrect under some circumstances?
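The status class alone narrows the question a little. A request the server considers malformed would normally be answered with a 4xx code, while 5xx means the server itself reports failing. That is not conclusive, since a CGI program that crashes on unexpected input also produces a 500. A small sketch of the classification from RFC 2616, section 10 (`classify_status` is a name invented for this example):

```python
def classify_status(code):
    """Map an HTTP status code to its class as defined in RFC 2616, section 10."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    # The first digit of the three-digit code selects the class.
    return classes.get(code // 100, "unknown")


print(classify_status(500))  # server error
```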


ID: 135394
Jim O'Dell

Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135397 - Posted: 11 Jul 2005, 2:29:07 UTC - in response to Message 135383.  




ID: 135397
Jim O'Dell

Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135406 - Posted: 11 Jul 2005, 2:37:57 UTC - in response to Message 135383.  




ID: 135406
Jim O'Dell

Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 135608 - Posted: 11 Jul 2005, 16:27:16 UTC - in response to Message 135406.  






Well, for no apparent reason the server is not returning an error today and I was able to upload my data. Amazing...
ID: 135608
Jim O'Dell

Joined: 27 May 99
Posts: 64
Credit: 6,320,552
RAC: 10
United States
Message 136520 - Posted: 14 Jul 2005, 23:46:38 UTC - in response to Message 135608.  





Another follow-up. I connect my computer to the internet in two different ways. One is by a Wi-Fi connection and the other is via GPRS over a Bluetooth phone. In every other application I use, the two methods behave identically except for speed. I can read my mail via IMAP, cruise the web, etc.

When I connect via Wi-Fi I seem to be able to connect to a scheduler. When I connect via my phone over GPRS I get the following:


2005-07-14 19:36:23 [DEBUG_HTTP ] HTTP_OP::init_post(): 0x89008 io_done 0
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): wrote HTTP header to socket 7: 245 bytes
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST /sah_cgi/cgi HTTP/1.0
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: User-Agent: BOINC client (powerpc-apple-darwin 4.43)
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: setiboinc.ssl.berkeley.edu:80
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 4151
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.1 500 Internal Server Error
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Thu, 14 Jul 2005 23:36:26 GMT
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/1.3.31 (Unix) mod_fastcgi/2.4.0
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Connection: close
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/html; charset=iso-8859-1
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=500
2005-07-14 19:36:29 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-14 19:36:29 [SETI@home] No schedulers responded
2005-07-14 19:36:29 [SETI@home] Deferring communication with project for 14 minutes and 53 seconds
2005-07-14 19:36:29 [DEBUG_SCHED_OP ] SCHEDULER_OP::poll(): return to idle state

Note the Apache "Internal Server Error" above. Does anyone have an idea how to debug this further?
Solving this may solve other people's inability to connect. Do BOINC developers monitor these pages?
If not, how do we get this kind of information to them? I sent an e-mail to the developers' list, but because I'm not a member it was bounced to the moderator for approval. So far I have received no response.
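One possible next step, sketched here under assumptions: the client log shows only the reply headers, but the specification quoted earlier says the server SHOULD include an entity explaining a 5xx error. Repeating the POST by hand and keeping the reply body might show that explanation. The function name `post_and_show_error` and the 60-second timeout are choices made for this example; the host and path would be the ones from the log (`setiboinc.ssl.berkeley.edu`, `/sah_cgi/cgi`), and the payload would have to be a captured scheduler request.

```python
import http.client


def post_and_show_error(host, path, payload, port=80, timeout=60.0):
    """POST payload (bytes) and return (status, reason, body text).

    A diagnostic sketch: unlike the client debug log, it keeps the reply
    body, which is where a server may explain a 5xx error.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(
            "POST",
            path,
            body=payload,
            headers={"Content-Type": "application/octet-stream"},
        )
        resp = conn.getresponse()  # does not raise on 4xx/5xx statuses
        # The log shows charset=iso-8859-1 on the error page.
        body = resp.read().decode("iso-8859-1", errors="replace")
        return resp.status, resp.reason, body
    finally:
        conn.close()
```

Running this once over Wi-Fi and once over GPRS with the same payload would show whether the two paths really get different answers, for instance because a proxy on the GPRS side alters the request. That proxy idea is only a guess, not something the logs confirm.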

ID: 136520


 