Panic Mode On (39) Server problems

Jim_S Joined: 23 Feb 00 Posts: 4705 Credit: 64,560,357 RAC: 31	Message 1038248 - Posted: 2 Oct 2010, 10:42:14 UTC I only have one (working) box that has not reported yet. But she's trying. I Desire Peace and Justice, Jim Scott (Mod-Ret.)

perryjay Volunteer tester Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0	Message 1038286 - Posted: 2 Oct 2010, 13:05:18 UTC Vic, I'm not going to try to find the thread, but Matt apologized about those VLARs. Seems a whole herd of them accidentally got released into the wild. The boys got a handle on them as soon as they noticed, but not before many of them had gone out. It shouldn't happen again. PROUD MEMBER OF Team Starfire World BOINC

OTS Volunteer tester Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0	Message 1038302 - Posted: 2 Oct 2010, 13:57:44 UTC - in response to Message 1038179.
I am one of those people who have been returning results (12+) with supposedly "0" seconds of work done and having the results validated for "0" credit, even though my wingman shows a lot of seconds for their results. The good news, if there is any, is that none of mine seem to be from Tim, Jim, or any other names that contribute here regularly, so I am not in fear of my life, but I still feel bad for those that did lose credit. Unfortunately I have no idea why it happened, as nothing has changed at my end to account for this as far as I know. I also don't understand how, if my wingman has actual results to show for his work while I have "Stderr output" to show for mine, the two results could be compared and entered as valid. I would have thought that if I had really returned a zero result, it should have been entered as an invalid result and the work sent out again. One explanation might be that I did submit the same result but the validation process failed to give me credit and the lowest credit prevailed. This is not a comforting thought if it can continue to happen. ... DSH
Validation is based on the uploaded result files. What's shown on the task detail pages is mostly information about the task sent in the request to the Scheduler. That only affects credit, not the science. Whatever is going wrong, it must be server code. The client surely is sending the same information as before. There's an option in the project configuration: <dont_store_success_stderr/> If present, don't store the stderr log in the database for successful workunits. May be useful to save on database size. Available since r18528. Trying that would account for not having a stderr section to show, but if it is bugged badly enough to also not save the runtime and CPU time, I'd guess no project has tried it since it was added in June 2009. Joe
Well, the count is now up to 30+, and for a plodder like me who doesn't submit a large number of results this is a little discouraging. What I don't understand is this: if it is server-side code that is creating the problem, why is it that when I look at the results it is always me that is showing no results and zero CPU time, resulting in zero credit? Unless I am interpreting the data wrong, I have not found a single zero-credit result that was determined by my wingman's result. It would appear that even if it is server-side code that is creating the error, there is something in my results that is the trigger. At the end of August I finally bit the bullet and upgraded the client from 5.28 to 6.03, but I didn't see anything different until this reporting period. ???? DSH

Claggy Volunteer tester Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1038309 - Posted: 2 Oct 2010, 14:41:11 UTC - in response to Message 1038302.
You could try updating BOINC to the current recommended version, 6.10.58; BOINC 6.2.14 is rather old. New Credit uses run time for part of its calculations. If the BOINC version is old and doesn't report run time, then CPU time is supposed to be used instead, but if your BOINC version is reporting zero run time, then the peak FLOP count is going to be zero too, ending up with zero credit being granted. Claggy
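The fallback Claggy describes can be sketched roughly as below. This is only an illustration of the logic in the post, not the actual BOINC server code: the function name `peak_flop_count`, its parameters, and the simple product of elapsed time and a device's peak FLOPS are all assumptions made for the example.

```python
from typing import Optional


def peak_flop_count(run_time: Optional[float], cpu_time: float,
                    peak_flops: float) -> float:
    """Hypothetical sketch of the run-time fallback described in the post.

    run_time is None when an old client does not report it at all;
    in that case CPU time is used instead. A reported run time of 0.0
    is taken at face value, so the result is 0.0.
    """
    elapsed = cpu_time if run_time is None else run_time
    return elapsed * peak_flops


# Old client that omits run time: CPU time is used as the fallback.
fallback = peak_flop_count(None, 5000.0, 2.0e9)

# Client that reports a run time of zero: the count comes out as zero,
# which in turn would lead to zero credit.
zeroed = peak_flop_count(0.0, 5000.0, 2.0e9)
```

Under these assumptions, a missing run time and a zero run time are different cases: only the missing value triggers the CPU-time fallback, which would match a host that did real work yet was granted zero credit.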

rebest Volunteer tester Joined: 16 Apr 00 Posts: 1296 Credit: 45,357,093 RAC: 0	Message 1038362 - Posted: 2 Oct 2010, 17:50:51 UTC Is the plan still to give us one full week without the three-day shutdown? My pendings are through the roof (10 days RAC). It would be nice to let things settle down before starting the weekly data crunch back up. Join the PACK!

[B^S] madmac Volunteer tester Joined: 9 Feb 04 Posts: 1175 Credit: 4,754,897 RAC: 0	Message 1038369 - Posted: 2 Oct 2010, 19:08:43 UTC I have had problems getting on the site. I was told it was something to do with port 80, either a firewall or a security problem at their end. Everything seems to be slow. I was told my firewall was OK. Any ideas as to what is causing the problems?

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1038381 - Posted: 2 Oct 2010, 19:35:54 UTC And the forums are just trashed.... This is painful. Don't know if this is the master database dragging everything down with it or some kinda network issue. The computers all seem to be connecting OK though, so all is not lost. "Freedom is just Chaos, with better lighting." Alan Dean Foster

soft^spirit Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0	Message 1038382 - Posted: 2 Oct 2010, 19:38:18 UTC - in response to Message 1038381. As long as it is just the board being laggy.. it is all good. Come on.. stay up, servers!! (Too bad there is not a thumbs up/down for posts like this.) Janice

Donald L. Johnson Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20	Message 1038384 - Posted: 2 Oct 2010, 19:47:25 UTC - in response to Message 1038383.
And the forums are just trashed.... This is painful. Don't know if this is the master database dragging everything down with it or some kinda network issue. The computers all seem to be connecting OK though, so all is not lost.
Well, the Data Distribution State on the Server status page hasn't updated for 5 hours. The page itself is up to date, but not the data within it. Me think it's the master database that is slowly diggin itself down.
Something gave - 3 hours ago I could not even connect to the homepage, now I can get to everything, and apparently at normal speed. Donald Infernal Optimist / Submariner, retired

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1038419 - Posted: 2 Oct 2010, 21:34:10 UTC My E7600 can connect well. But my 940 BE can UL, but not report/request. Since 17:12 UTC: Scheduler request failed: HTTP internal server error. A reboot didn't help. Does anyone else have problems too?

Jamie Volunteer tester Joined: 5 Apr 06 Posts: 162 Credit: 9,867,955 RAC: 0	Message 1038421 - Posted: 2 Oct 2010, 21:38:28 UTC - in response to Message 1038419. Yep, I just had to replace a dead HDD, so I was hoping to get some WUs to replace the lost ones, and now I can't connect (but managed to attach again) to download any more.

JohnDK Volunteer tester Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127	Message 1038422 - Posted: 2 Oct 2010, 21:38:36 UTC - in response to Message 1038419. Last modified: 2 Oct 2010, 21:40:23 UTC
My E7600 can connect well. But my 940 BE can UL, but not report/request. Since 17:12 UTC: Scheduler request failed: HTTP internal server error. A reboot didn't help. Does anyone else have problems too?
My 2 PCs report and request normally; the only issue is getting work. I guess out of 10 requests only 1 gives work, and then only a few. I guess the servers are busy, but then again, crickets doesn't show it being busy.

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1038423 - Posted: 2 Oct 2010, 21:44:45 UTC Last modified: 2 Oct 2010, 21:49:57 UTC Thanks to you both. But I guess I have another problem, because BOINC on my 940 BE gets [EDIT: (for ~ 4 1/2 hours now) only] the error message: Scheduler request failed: HTTP internal server error. Do you get the same message? The strange thing is that only one of my two PCs gets this message.

Jamie Volunteer tester Joined: 5 Apr 06 Posts: 162 Credit: 9,867,955 RAC: 0	Message 1038424 - Posted: 2 Oct 2010, 21:45:48 UTC - in response to Message 1038423. I get the same message: 02/10/2010 22:45:01 | SETI@home | Scheduler request failed: HTTP internal server error

JohnDK Volunteer tester Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127	Message 1038425 - Posted: 2 Oct 2010, 21:47:37 UTC - in response to Message 1038423.
Thanks to you both. But I guess I have another problem, because BOINC on my 940 BE gets the error message: Scheduler request failed: HTTP internal server error. Do you get the same message? The strange thing is that only one of my two PCs gets this message.
No errors here; reporting and requesting work have all gone well and quickly. Except just now: I think an AP spike is beginning.

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1038430 - Posted: 2 Oct 2010, 21:54:43 UTC Last modified: 2 Oct 2010, 21:57:21 UTC Thanks again.. BOINC now got a new error message after ~ 4 1/2 hours:
Project communication failed: attempting access to reference site
Scheduler request failed: Couldn't connect to server
Internet access OK - project servers may be temporarily down.
And then again the old one:
Scheduler request failed: HTTP internal server error
Ohh.. a pity. I guess it's not at my end. EDIT: But the strange thing is that only one of my two PCs has problems..

Robert Ribbeck Joined: 7 Jun 02 Posts: 644 Credit: 5,283,174 RAC: 0	Message 1038431 - Posted: 2 Oct 2010, 21:54:44 UTC - in response to Message 1038421. Sorry for your lost HD. May I suggest giving Steve Gibson's SpinRite a try on that drive, especially if it was running WINBLOWs. It fixes drive problems like nothing else can. I would have sent this via a PM, but the ghost is blocking me.

Questor Volunteer tester Joined: 3 Sep 04 Posts: 471 Credit: 230,506,401 RAC: 157	Message 1038558 - Posted: 3 Oct 2010, 3:20:17 UTC Is anyone else having problems reporting? Most of my machines are reporting one or two tasks at a time OK. However, I have one which hasn't reported for a time and now has over 100 tasks to report. When it reports I get:
03/10/2010 04:14:38 SETI@home update requested by user
03/10/2010 04:14:41 SETI@home Sending scheduler request: Requested by user.
03/10/2010 04:14:41 SETI@home Reporting 109 completed tasks, requesting new tasks for CPU and GPU
03/10/2010 04:16:13 SETI@home Scheduler request failed: HTTP internal server error
Have rebooted machine. GPU Users Group

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1038560 - Posted: 3 Oct 2010, 3:31:30 UTC - in response to Message 1038558. Last modified: 3 Oct 2010, 3:36:21 UTC
E7600: Reporting 51 completed tasks, requesting new tasks for CPU and GPU. Scheduler request failed: HTTP internal server error. Last scheduler contact: 22:41 UTC
940 BE: Reporting 360 completed tasks, requesting new tasks for GPU. Scheduler request failed: HTTP internal server error. Last scheduler contact: 17:12 UTC
EDIT: BTW, AFAIK 'HTTP internal server error' means a problem with the S@h scheduler.

Geek@Play Volunteer tester Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0	Message 1038562 - Posted: 3 Oct 2010, 3:54:19 UTC Hmm.............. Last contacted SETI on 10 Sep 2010 22:17:04 UTC. I would perform a project reset. It is more important to get the machine back online with SETI, since the completed work is probably lost anyway. Be advised that you will probably lose the optimized apps installation too, so be prepared to reinstall the optimized apps. This is what I would do, but it's your machine............... Boinc....Boinc....Boinc....Boinc....

