Author | Message |
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
I have been very frustrated with BOINC over the last few weeks and think I have realized the reason why.
It seemed reasonable to me that once I had uploaded my results to the server, I should see them on my home page at SETI. Unfortunately this is NOT TRUE. Two things have to happen:
1) The results have to upload
2) My computer has to contact a scheduler to somehow tell the servers that my results are there
Because I did not know this, I thought that I had somehow mis-installed the software, and I just kept uninstalling and reinstalling it.
All of this is a result of the scheduler being SO, SO difficult to contact. I would look at the server stats page and everything would seem just fine, no backlogs, etc. Unfortunately, I still could not contact a scheduler.
I now believe that we are missing a critical piece of info on the server stats page and that it does not accurately reflect the status of the scheduler. WE NEED SOMETHING ON THAT PAGE WHICH INDICATES WHETHER THE SCHEDULER IS DIFFICULT TO CONTACT.
I believe that this would significantly reduce people's reported dissatisfaction with BOINC.
Also, I believe that the BOINC/SETI staff's reluctance to comment on these issues is compounding the problem. When we were running old SETI we would get regular messages about bad performance, but now all we get is a deafening silence. This only leads people to believe that the software is terrible and broken when, in fact, what we are seeing is poor performance.
Unless we get either better performance or more communication from BOINC/SETI staff, BOINC's reputation will continue to suffer until we see an article in the New York Times: "Mass Defections from the World's First Grid Project." That truly would be an unfortunate end for such an important project.
Come on guys, give us some feedback.
ID: 135145 · |
|
Saenger Volunteer tester
Send message Joined: 3 Apr 99 Posts: 2452 Credit: 33,281 RAC: 0
|
Hello and welcome,
the regular messages are on the technical news page and usually on the home page as well.
If you want more information on the actual status of the different projects, you can use this excellent site.
Greetings from Saenger
For questions about BOINC, look in the BOINC-Wiki
ID: 135164 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
I just checked and the site says that everything is just dandy. Something is wrong because I still can't contact the scheduler. Judging from the number of complaints I see, I don't believe that I'm the only one.
Somewhere, somehow, something is either misleading or broken. How can we solve this problem if we don't hear from the site managers?
ID: 135166 · |
|
Saenger Volunteer tester
Send message Joined: 3 Apr 99 Posts: 2452 Credit: 33,281 RAC: 0
|
I just tried the manual update, and it succeeded immediately:
10.07.2005 18:06:56|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
10.07.2005 18:06:56|SETI@home|Requesting 0 seconds of work, returning 0 results
10.07.2005 18:06:58|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
Can you post your messages so we can help further?
Greetings from Saenger
For questions about BOINC, look in the BOINC-Wiki
ID: 135168 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
2005-07-10 11:55:59 [SETI@home] No schedulers responded
2005-07-10 12:04:54 [---] Received signal 15
2005-07-10 12:05:01 [---] Starting BOINC client version 4.43 for powerpc-apple-darwin
2005-07-10 12:05:01 [---] Data directory: /Users/odell/Library/Application Support/Boinc Data
2005-07-10 12:05:01 [SETI@home] Computer ID: 1127764; location: home; project prefs: home
2005-07-10 12:05:01 [---] General prefs: from SETI@home (last modified 2005-06-22 07:58:51)
2005-07-10 12:05:01 [---] General prefs: using separate prefs for home
2005-07-10 12:05:01 [SETI@home] Resuming computation for result 29ap04aa.23242.22242.779842.225_0 using setiathome version 4.18
2005-07-10 12:05:01 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-07-10 12:05:01 [---] schedule_cpus: must schedule
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] Deadline is before reconnect time
2005-07-10 12:05:01 [---] New CPU scheduler policy: earliest deadline first.
2005-07-10 12:05:01 [---] earliest deadline: 1121463390.000000 29ap04aa.23242.22242.779842.225_0
2005-07-10 12:05:09 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-10 12:05:09 [SETI@home] No schedulers responded
2005-07-10 12:05:09 [SETI@home] Deferring communication with project for 1 hours, 1 minutes, and 18 seconds
I've been getting these kinds of messages regularly. Sometimes I can connect, but mostly I fail. Are you suggesting that I am the only one experiencing this problem? This list seems rife with such problems.
ID: 135174 · |
|
Thierry Van Driessche Volunteer tester
Send message Joined: 20 Aug 02 Posts: 3083 Credit: 150,096 RAC: 0
|
2005-07-10 12:05:01 [---] General prefs: using separate prefs for home
2005-07-10 12:05:01 [SETI@home] Resuming computation for result 29ap04aa.23242.22242.779842.225_0 using setiathome version 4.18
2005-07-10 12:05:01 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-07-10 12:05:01 [---] schedule_cpus: must schedule
2005-07-10 12:05:09 [SETI@home] No schedulers responded
2005-07-10 12:05:09 [SETI@home] Deferring communication with project for 1 hours, 1 minutes, and 18 seconds
What you could do is the following. Go to the command prompt and type (on a Mac or Linux machine the command is traceroute instead of tracert):
tracert setiboinc.ssl.berkeley.edu
The output should end with something similar to the following:
7 21 ms 13 ms 13 ms p14-0.core01.lon02.atlas.cogentco.com [130.117.1.101]
8 102 ms 90 ms 88 ms p10-0.core01.bos01.atlas.cogentco.com [130.117.0.46]
9 120 ms 111 ms 109 ms p5-0.core01.ord01.atlas.cogentco.com [66.28.4.110]
10 178 ms 176 ms 178 ms p5-0.core01.sfo01.atlas.cogentco.com [66.28.4.185]
11 177 ms 166 ms 167 ms UC-Berkeley.demarc.cogentco.com [66.250.4.74]
12 173 ms 169 ms 177 ms 66.28.250.124
If this is not the case, then you have a connection problem to Berkeley. This can be a modem problem, an ISP problem, a connector problem somewhere between host and ISP, or a wiring problem.
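As a rough aid for reading such a trace, here is a minimal Python sketch that checks whether the last hop of pasted tracert/traceroute output is the target address. It is not part of BOINC; the function name `trace_reached_target` is made up for this illustration, and it assumes every hop line begins with the hop number.

```python
import re


def trace_reached_target(trace_output: str, target_ip: str) -> bool:
    """Return True if the last numbered hop line mentions target_ip."""
    hop_lines = [
        line.strip()
        for line in trace_output.splitlines()
        if re.match(r"\s*\d+\s", line)  # hop lines begin with the hop number
    ]
    if not hop_lines:
        return False
    # Match the address as a whole token, so 66.28.250.12 does not match 66.28.250.124.
    pattern = r"(?<![\d.])" + re.escape(target_ip) + r"(?![\d.])"
    return re.search(pattern, hop_lines[-1]) is not None


sample = """\
 11 177 ms 166 ms 167 ms UC-Berkeley.demarc.cogentco.com [66.250.4.74]
 12 173 ms 169 ms 177 ms 66.28.250.124
"""
print(trace_reached_target(sample, "66.28.250.124"))
```

A trace whose last hop is a timeout line, or that stops at an earlier router, makes the function return False, which corresponds to the "connection problem" case described above.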
Read the News at the homepage
Have a look at the Technical News
Look if your question is not answered at the Boinc Wiki
Best greetings, Thierry
ID: 135180 · |
|
Saenger Volunteer tester
Send message Joined: 3 Apr 99 Posts: 2452 Credit: 33,281 RAC: 0
|
I've been getting these kinds of messages regularly. Sometimes I can connect, but mostly I fail. Are you suggesting that I am the only one experiencing this problem? This list seems rife with such problems.
No, I'm not. I'm just suggesting that there may be issues other than a broken scheduler involved.
- A proxy or firewall is probably not the problem for you, as you already have some WUs. Or have you changed anything in this regard?
- It may be the ISP; that has happened before, but I can't say anything yet. Have you tried a trace to the scheduler?
- It could be bad luck as well, but I doubt that it would strike often enough to be noticed :(
BTW: here's my trace outcome:
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
C:/Dokumente und Einstellungen/Uwe>tracert setiboinc.ssl.berkeley.edu
Tracing route to setiboinc.ssl.berkeley.edu [66.28.250.124] over a maximum of 30 hops:
1 55 ms 55 ms 55 ms 213-182-110-254.teleos-web.de [213.182.110.254]
2 * * * Request timed out.
3 86 ms 57 ms 54 ms m10-hf.dts-online.net [212.62.64.30]
4 60 ms 57 ms 57 ms DTS.DO-2-pos130.de.lambdanet.net [217.71.111.29]
5 59 ms 58 ms 60 ms DUS-2-pos210.de.lambdanet.net [217.71.105.57]
6 62 ms 63 ms 63 ms FRA-8-pos000.de.lambdanet.net [217.71.105.46]
7 62 ms 64 ms 62 ms g0-1-243.core01.fra03.atlas.cogentco.com [130.117.0.109]
8 67 ms 65 ms 66 ms p12-0.core01.ams03.atlas.cogentco.com [130.117.0.86]
9 72 ms 74 ms 73 ms p5-0.core01.lon02.atlas.cogentco.com [130.117.1.225]
10 151 ms 150 ms 150 ms p10-0.core01.bos01.atlas.cogentco.com [130.117.0.46]
11 172 ms 172 ms 171 ms p5-0.core01.ord01.atlas.cogentco.com [66.28.4.110]
12 217 ms 216 ms 217 ms p5-0.core01.sfo01.atlas.cogentco.com [66.28.4.185]
13 220 ms 218 ms 219 ms UC-Berkeley.demarc.cogentco.com [66.250.4.74]
14 218 ms 221 ms 219 ms 66.28.250.124
Trace complete.
AFAIK something like step 2 can lead to rejected/incomplete transfers, but not to a 'no response', but I may be wrong.
Greetings from Saenger
For questions about Boinc look in the BOINC-Wiki
ID: 135183 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
Here is my traceroute information:
Traceroute has started ...
traceroute to setiboinc.ssl.berkeley.edu (66.28.250.124), 64 hops max, 40 byte packets
1 66.102.164.7 (66.102.164.7) 893.696 ms 741.072 ms 799.223 ms
2 66.102.164.3 (66.102.164.3) 728.195 ms 1052.500 ms 737.483 ms
3 66.102.164.33 (66.102.164.33) 767.430 ms 915.520 ms 755.489 ms
4 66.102.163.137 (66.102.163.137) 805.519 ms 548.411 ms 919.615 ms
5 66.102.163.134 (66.102.163.134) 894.530 ms 803.767 ms 875.124 ms
6 66.209.15.129 (66.209.15.129) 575.872 ms 862.352 ms 839.315 ms
7 66.10.13.1 (66.10.13.1) 1016.126 ms 569.340 ms 812.330 ms
8 bb2-g8-0-0.atlnga.sbcglobal.net (66.10.12.242) 831.589 ms 565.963 ms 803.569 ms
9 ex1-p11-2.pxatga.sbcglobal.net (151.164.42.15) 819.979 ms 893.166 ms 813.620 ms
10 151.164.248.182 (151.164.248.182) 1558.032 ms 824.823 ms 787.973 ms
11 p14-0.core01.mco01.atlas.cogentco.com (66.28.4.153) 799.651 ms 1168.278 ms 808.012 ms
12 p14-0.core01.tpa01.atlas.cogentco.com (66.28.4.142) 1163.501 ms 813.614 ms 538.667 ms
13 p5-0.core01.iah01.atlas.cogentco.com (66.28.4.45) 810.063 ms 824.560 ms 782.173 ms
14 p10-0.core01.sjc01.atlas.cogentco.com (66.28.4.238) 829.133 ms 621.404 ms 829.113 ms
15 p4-0.core01.sfo01.atlas.cogentco.com (66.28.4.93) 841.119 ms 1429.698 ms 832.816 ms
16 uc-berkeley.demarc.cogentco.com (66.250.4.74) 1048.960 ms 1081.700 ms 998.110 ms
17 66.28.250.124 (66.28.250.124) 818.155 ms 848.781 ms 853.295 ms
Is it possible that the software on my PC (Mac) is giving up too quickly? It seems to time out with a no response rather quickly.
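One way to see where the delay comes from is to compare the round-trip times hop by hop. The following is a minimal Python sketch, not a BOINC tool; the helper name `hop_medians` is invented here, and it assumes the times appear as `<number> ms` on each hop line, as in the traces posted in this thread.

```python
import re
from statistics import median


def hop_medians(trace_output: str) -> list[tuple[int, float]]:
    """Return (hop number, median round-trip time in ms) for each hop that answered."""
    results = []
    for line in trace_output.splitlines():
        hop = re.match(r"\s*(\d+)\s", line)  # hop lines begin with the hop number
        times = [float(t) for t in re.findall(r"(\d+(?:\.\d+)?) ms", line)]
        if hop and times:  # skip headers and hops that only timed out
            results.append((int(hop.group(1)), median(times)))
    return results


sample = """\
 1 66.102.164.7 (66.102.164.7) 893.696 ms 741.072 ms 799.223 ms
 17 66.28.250.124 (66.28.250.124) 818.155 ms 848.781 ms 853.295 ms
"""
for hop, rtt in hop_medians(sample):
    print(f"hop {hop}: median {rtt:.1f} ms")
# hop 1: median 799.2 ms
# hop 17: median 848.8 ms
```

In the trace above, the very first hop already takes around 800 ms and the last hop is only slightly slower, which suggests that most of the delay is added on the first link rather than near Berkeley.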
ID: 135221 · |
|
Saenger Volunteer tester
Send message Joined: 3 Apr 99 Posts: 2452 Credit: 33,281 RAC: 0
|
That's possible, but as I'm on windoze, not on a proper puter ;), I can't tell you. Perhaps you should post it as well over in the Mac part of this forum or in number crunching.
Sorry for not being able to help :(
Greetings from Saenger
For questions about Boinc look in the BOINC-Wiki
ID: 135223 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
I still say we need some help from the implementors. Whatever the problem is, I think a lot more people than me have this problem, and it's giving the project a big, big, black mark.
ID: 135225 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
To follow up, I downloaded the code for BOINC and have been looking at the HTTP polling code. It looks to me as if, when the reply from the server has not arrived by the time the code gets to the polling routine, the code fails with a "No schedulers responded" message. It seems to me that the polling code should just return and wait for the data to show up. It's a question of the poll method returning true rather than false.
If I'm correct, that would explain the behavior we're all seeing: people with low-latency connections would not see a problem, and people with higher-latency connections would be frustrated.
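The behavior proposed above can be sketched in a few lines of Python. This is not the BOINC source and does not claim to reproduce its logic; `PollResult`, `poll_reply`, and the `deadline` parameter are names invented for this illustration. The point is only that "no data yet" is reported as pending, and failure is reported only after a deadline or when the peer closes the connection.

```python
import select
import socket
from enum import Enum


class PollResult(Enum):
    PENDING = "pending"  # nothing to read yet; the caller should poll again later
    DONE = "done"        # reply bytes were read into the buffer
    FAILED = "failed"    # deadline passed, or the peer closed without sending data


def poll_reply(sock: socket.socket, buffer: bytearray, now: float, deadline: float) -> PollResult:
    """Perform one non-blocking poll step on sock."""
    readable, _, _ = select.select([sock], [], [], 0)  # timeout 0: do not block
    if not readable:
        # No data yet is not an error in itself; only give up after the deadline.
        return PollResult.FAILED if now >= deadline else PollResult.PENDING
    chunk = sock.recv(4096)
    if not chunk:
        return PollResult.FAILED  # peer closed the connection without sending data
    buffer.extend(chunk)
    return PollResult.DONE


# Demonstration with a local socket pair standing in for client and server.
client, server = socket.socketpair()
reply = bytearray()
print(poll_reply(client, reply, now=0.0, deadline=30.0))  # reply not sent yet
server.sendall(b"HTTP/1.1 200 OK\r\n\r\n")
print(poll_reply(client, reply, now=1.0, deadline=30.0))  # reply available
client.close()
server.close()
```

With this shape, a slow link only means more PENDING results before DONE; a high-latency client is not treated as a failed request until the deadline chosen by the caller has passed.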
ID: 135253 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
I sent a note to the BOINC developers list asking if someone could look into this problem in the code.
ID: 135283 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
I have looked into this further and found that I could turn on some debugging info local to my machine.
Here is some output that may be useful. Note the line below that says "Internal Server Error". How can we debug this further?
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req: http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:52 [DEBUG_HTTP ] HTTP_OP::init_post(): 0x89008 io_done 0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): wrote HTTP header to socket 7: 245 bytes
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST /sah_cgi/cgi HTTP/1.0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: User-Agent: BOINC client (powerpc-apple-darwin 4.43)
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: setiboinc.ssl.berkeley.edu:80
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 6597
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.1 500 Internal Server Error
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Mon, 11 Jul 2005 01:50:59 GMT
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/1.3.31 (Unix) mod_fastcgi/2.4.0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Connection: close
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/html; charset=iso-8859-1
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=500
2005-07-10 21:51:05 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-10 21:51:05 [SETI@home] No schedulers responded
2005-07-10 21:51:06 [SETI@home] Deferring communication with project for 13 minutes and 31 seconds
2005-07-10 21:51:06 [DEBUG_SCHED_OP ] SCHEDULER_OP::poll(): return to idle state
ID: 135383 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
Looking up the 500 error code (the text below matches section 10.5 of the HTTP/1.1 specification, RFC 2616) gives:
10.5 Server Error 5xx
Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. User agents SHOULD display any included entity to the user. These response codes are applicable to any request method.
10.5.1 500 Internal Server Error
The server encountered an unexpected condition which prevented it from fulfilling the request.
Does this mean that the server code has a bug, or that the messages being sent by the client are incorrect under some circumstances?
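As a rough illustration of how the status line in the debug output can be read, the first digit of the code gives its class. The Python sketch below is only an illustration; `classify_status` is a made-up helper, not something from the BOINC client.

```python
import re


def classify_status(log_line: str) -> str:
    """Extract the HTTP status code from a reply status line and name its class."""
    match = re.search(r"HTTP/\d\.\d (\d{3})", log_line)
    if match is None:
        return "no status line found"
    code = int(match.group(1))
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error (the request was at fault)",
        5: "server error (the server failed to handle the request)",
    }
    return f"{code}: {classes.get(code // 100, 'unknown class')}"


line = (
    "2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): "
    "header: HTTP/1.1 500 Internal Server Error"
)
print(classify_status(line))
```

By the definition quoted above, a 5xx code means the server knows it erred or could not perform the request, while a malformed request would normally be answered with a 4xx code. That points toward the server side, although a request that makes the server-side handler fail could also surface as a 500, so the log line alone does not settle the question.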
ID: 135389 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
Is is possible that the software on my PC (Mac) is giving up to quickly? It seems to timeout with a no response rather quickly.
That's possible, but as I'm on windoze, not on a proper puter ;), I can't tell you. Perhaps you should post it as well over in the Mac part of this forum or in number crunching.
Sorry for not being able to help :(
I still say we need some help from the implementors. Whatever the problem is, I think a lot more people than me have this problem and its giving the project a big, big, black mark.
To follow-up, I downloaded the code for Boinc and have been looking at the HTTP polling code. It looks to me that if the result from the server has not arrived by the time the code gets to the polling routine that the code will fail with a "No schedulers responded" message. Looks to me that the polling code should just return and wait for the data to show up. Its a question of the poll method returning true rather than false.
If I'm correct that would explain the behavior we're all seeing - people with low latency connections would not see a problem and people with higher latency connections would be frustrated.
I sent a note to the BOINC developers list asking if somone could look into this problem in the code.
I have looked into this further and found that I could turn on some debugging info local to my machine.
Here is some output that may be useful. Note the line below that says "Internal Server Error" How can we debug this further?
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req: http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:52 [DEBUG_HTTP ] HTTP_OP::init_post(): 0x89008 io_done 0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): wrote HTTP header to socket 7: 245 bytes
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST /sah_cgi/cgi HTTP/1.0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: User-Agent: BOINC client (powerpc-apple-darwin 4.43)
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: setiboinc.ssl.berkeley.edu:80
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 6597
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.1 500 Internal Server Error
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Mon, 11 Jul 2005 01:50:59 GMT
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/1.3.31 (Unix) mod_fastcgi/2.4.0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Connection: close
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/html; charset=iso-8859-1
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=500
2005-07-10 21:51:05 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-10 21:51:05 [SETI@home] No schedulers responded
2005-07-10 21:51:06 [SETI@home] Deferring communication with project for 13 minutes and 31 seconds
2005-07-10 21:51:06 [DEBUG_SCHED_OP ] SCHEDULER_OP::poll(): return to idle state
Looking in the 500 error code for Apache gives:
10.5 Server Error 5xx
Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. User agents SHOULD display any included entity to the user. These response codes are applicable to any request method.
10.5.1 500 Internal Server Error
The server encountered an unexpected condition which prevented it from fulfilling the request.
Does this mean that the server code has a bug, or that the messages being sent by the client are incorrect under some circumstances?
ID: 135394 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
Is it possible that the software on my PC (Mac) is giving up too quickly? It seems to time out with a "no response" rather quickly.
That's possible, but as I'm on windoze, not on a proper puter ;), I can't tell you. Perhaps you should post it as well over in the Mac part of this forum or in number crunching.
Sorry for not being able to help :(
I still say we need some help from the implementors. Whatever the problem is, I think a lot more people than me have this problem and it's giving the project a big, big, black mark.
To follow up, I downloaded the code for BOINC and have been looking at the HTTP polling code. It looks to me as though, if the reply from the server has not arrived by the time the code gets to the polling routine, the code will fail with a "No schedulers responded" message. It seems to me that the polling code should just return and wait for the data to show up. It's a question of the poll method returning true rather than false.
If I'm correct, that would explain the behavior we're all seeing: people with low-latency connections would not see a problem, and people with higher-latency connections would be frustrated.
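As an illustration of that idea, here is a minimal sketch in Python rather than the client's C++. It is not taken from the BOINC source: `PollResult` and `poll_reply` are hypothetical names, and the real client may be structured quite differently. The only point is that a poll routine can report "nothing yet, try again" as a separate outcome instead of folding it into failure.

```python
import enum
import select
import socket


class PollResult(enum.Enum):
    PENDING = "pending"  # nothing to read yet; the caller should poll again later
    DATA = "data"        # some reply bytes were read and appended to the buffer
    CLOSED = "closed"    # the peer closed the connection; the reply is complete


def poll_reply(sock, buf, timeout=0.0):
    """Check sock for reply bytes without blocking longer than `timeout` seconds.

    Any bytes read are appended to `buf` (a bytearray). A socket with no
    data yet is reported as PENDING, not as an error.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return PollResult.PENDING
    chunk = sock.recv(4096)
    if not chunk:
        return PollResult.CLOSED
    buf.extend(chunk)
    return PollResult.DATA
```

A caller would keep calling `poll_reply` from its main loop while it returns `PENDING`, and would only declare the scheduler unreachable after an overall deadline of its own choosing has passed.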
I sent a note to the BOINC developers list asking if someone could look into this problem in the code.
I have looked into this further and found that I could turn on some debugging info local to my machine.
Here is some output that may be useful. Note the line below that says "Internal Server Error". How can we debug this further?
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req: http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:51 [DEBUG_SCHED_OP ] req:
2005-07-10 21:50:52 [DEBUG_HTTP ] HTTP_OP::init_post(): 0x89008 io_done 0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): wrote HTTP header to socket 7: 245 bytes
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST /sah_cgi/cgi HTTP/1.0
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: User-Agent: BOINC client (powerpc-apple-darwin 4.43)
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: setiboinc.ssl.berkeley.edu:80
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 6597
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
2005-07-10 21:50:53 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.1 500 Internal Server Error
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Mon, 11 Jul 2005 01:50:59 GMT
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/1.3.31 (Unix) mod_fastcgi/2.4.0
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Connection: close
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/html; charset=iso-8859-1
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
2005-07-10 21:51:05 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=500
2005-07-10 21:51:05 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-10 21:51:05 [SETI@home] No schedulers responded
2005-07-10 21:51:06 [SETI@home] Deferring communication with project for 13 minutes and 31 seconds
2005-07-10 21:51:06 [DEBUG_SCHED_OP ] SCHEDULER_OP::poll(): return to idle state
Looking up the 500 error code that Apache returned (the text below is from the HTTP/1.1 specification, RFC 2616, section 10.5) gives:
10.5 Server Error 5xx
Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. User agents SHOULD display any included entity to the user. These response codes are applicable to any request method.
10.5.1 500 Internal Server Error
The server encountered an unexpected condition which prevented it from fulfilling the request.
Does this mean that the server code has a bug, or that the messages being sent by the client are incorrect under some circumstances?
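One way to frame that question is by status class: in HTTP/1.1, 4xx codes are how a server says the request was at fault, and 5xx codes are how it says the failure was on its own side. A small sketch of that reading, applied to the status line from the log above (`parse_status_line` and `blamed_side` are made-up helper names, not part of any library):

```python
def parse_status_line(line):
    """Split an HTTP status line into (version, code, reason)."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


def blamed_side(code):
    """Return which side the status class points at: 4xx client, 5xx server."""
    if 400 <= code <= 499:
        return "client"
    if 500 <= code <= 599:
        return "server"
    return "neither"


version, code, reason = parse_status_line("HTTP/1.1 500 Internal Server Error")
print(code, reason, "->", blamed_side(code))  # prints: 500 Internal Server Error -> server
```

This does not fully settle it. A CGI program that crashes on input it did not expect also surfaces as a 500, so an unusual client request could still be the trigger. What the status does show is that the server received the request and answered it, which a pure client-side timeout would not produce.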
ID: 135397 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
Well, for no apparent reason the server is not getting an error today and I was able to upload my data. Amazing...
ID: 135608 · |
|
Jim O'Dell
Send message Joined: 27 May 99 Posts: 64 Credit: 6,320,552 RAC: 10
|
Another follow-up. I connect my computer to the internet in two different ways: one is a Wi-Fi connection and the other is GPRS over a Bluetooth phone. In every other application I use, the two methods behave identically except for speed. I can read my mail via IMAP, browse the web, etc.
When I connect via Wi-Fi, I seem to be able to connect to a scheduler. When I connect via my phone over GPRS, I get the following:
2005-07-14 19:36:23 [DEBUG_HTTP ] HTTP_OP::init_post(): 0x89008 io_done 0
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): wrote HTTP header to socket 7: 245 bytes
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: POST /sah_cgi/cgi HTTP/1.0
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Pragma: no-cache
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Cache-Control: no-cache
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: User-Agent: BOINC client (powerpc-apple-darwin 4.43)
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Host: setiboinc.ssl.berkeley.edu:80
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Connection: close
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Type: application/octet-stream
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header: Content-Length: 4151
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): request header:
2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: HTTP/1.1 500 Internal Server Error
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Date: Thu, 14 Jul 2005 23:36:26 GMT
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Server: Apache/1.3.31 (Unix) mod_fastcgi/2.4.0
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Connection: close
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header: Content-Type: text/html; charset=iso-8859-1
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::read_reply(): header:
2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_REPLY_HEADER::parse(): status=500
2005-07-14 19:36:29 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed
2005-07-14 19:36:29 [SETI@home] No schedulers responded
2005-07-14 19:36:29 [SETI@home] Deferring communication with project for 14 minutes and 53 seconds
2005-07-14 19:36:29 [DEBUG_SCHED_OP ] SCHEDULER_OP::poll(): return to idle state
Note the Apache internal server error above. Does anyone have an idea how to debug this further?
Solving this may solve other people's inability to connect. Do BOINC developers monitor these pages?
If not, how do we get this kind of information to them? I sent an e-mail to the developers' list, but because I'm not a member it was bounced to the moderator for approval. As of yet I have gotten no response.
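One inexpensive check that needs nothing beyond the pasted debug output is to measure how long the server took to answer. The sketch below is only an illustration: `reply_delay_seconds` is a hypothetical helper, and it assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and contains the two marker phrases seen in the logs above.

```python
from datetime import datetime


def reply_delay_seconds(log_lines):
    """Seconds between 'finished sending request body' and 'reading reply header'.

    Returns None if either marker is missing. Lines that do not begin with
    a timestamp are skipped.
    """
    sent = None
    for line in log_lines:
        try:
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue
        if sent is None and "finished sending request body" in line:
            sent = stamp
        elif sent is not None and "reading reply header" in line:
            return (stamp - sent).total_seconds()
    return None


gprs_log = [
    "2005-07-14 19:36:24 [DEBUG_HTTP ] HTTP_OP_SET::poll(): finished sending request body",
    "2005-07-14 19:36:28 [DEBUG_HTTP ] HTTP_OP_SET::poll(): reading reply header; io_ready 1 io_done 0",
]
print(reply_delay_seconds(gprs_log))
```

Run over the GPRS log above, the gap between the two markers is 4 seconds; for the 2005-07-10 log it is 12 seconds. In both cases a reply did arrive and it was a 500, so these particular failures do not look like the client giving up before the server answered.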
ID: 136520 · |
|