Panic Mode On (39) Server problems

Message boards : Number crunching : Panic Mode On (39) Server problems

Profile Jim_S
Joined: 23 Feb 00
Posts: 4705
Credit: 64,560,357
RAC: 31
United States
Message 1038248 - Posted: 2 Oct 2010, 10:42:14 UTC

I only have one (working) box that has not reported yet.
But she's trying.

I Desire Peace and Justice, Jim Scott (Mod-Ret.)
ID: 1038248
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1038286 - Posted: 2 Oct 2010, 13:05:18 UTC

Vic, I'm not going to try to find the thread but Matt apologized about those VLARs. Seems a whole herd of them accidentally got released into the wild. The boys got a handle on them as soon as they noticed but not before many of them had gone out. It shouldn't happen again.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1038286
OTS
Volunteer tester

Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1038302 - Posted: 2 Oct 2010, 13:57:44 UTC - in response to Message 1038179.  


I am one of those people who have been returning results (12+) with supposedly "0" seconds of work done and having the results validated for "0" credit, even though my wingman shows a lot of seconds for their results. The good news, if there is any, is that none of mine seem to be from Tim, Jim, or any other names that contribute here regularly, so I am not in fear of my life, but I still feel bad for those that did lose credit.

Unfortunately I have no idea why it happened, as nothing has changed at my end to account for this as far as I know. I also don't understand how, if my wingman has actual results to show for his work while I have "Stderr output" to show for mine, the two results could be compared and entered as valid. I would have thought that if I had really returned a zero result, it should have been entered as an invalid result and the work sent out again. One explanation might be that I did submit the same result but the validation process failed to give me credit and the lowest credit prevailed. This is not a comforting thought if it can continue to happen.
...
DSH

Validation is based on the uploaded result files. What's shown on the task detail pages is mostly information about the task sent in the request to the Scheduler. That only affects credit, not the science.

Whatever is going wrong, it must be server code. The client surely is sending the same information as before. There's an option in the project configuration:

<dont_store_success_stderr/>

If present, don't store the stderr log in the database for successful workunits. May be useful to save on database size. Available since r18528.

Trying that would account for not having a stderr section to show, but if it is bugged bad enough to also not save the runtime and CPU time I'd guess no project has tried it since it was added in June 2009.
                                                                Joe


Well, the count is now up to 30+, and for a plodder like me who doesn't submit a large number of results this is a little discouraging. What I don't understand is this: if it is server-side code that is creating the problem, why is it that when I look at the results it is always mine that shows no results and zero CPU time, resulting in zero credit?

Unless I am interpreting the data wrong, I have not found a single zero-credit result that was determined by my wingman's result. It would appear that even if it is server-side code that is creating the error, there is something in my results that is the trigger. At the end of August I finally bit the bullet and upgraded the client from 5.28 to 6.03, but I didn't see anything different until this reporting period. ???? DSH
ID: 1038302
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1038309 - Posted: 2 Oct 2010, 14:41:11 UTC - in response to Message 1038302.  


I am one of those people who have been returning results (12+) with supposedly "0" seconds of work done and having the results validated for "0" credit, even though my wingman shows a lot of seconds for their results. The good news, if there is any, is that none of mine seem to be from Tim, Jim, or any other names that contribute here regularly, so I am not in fear of my life, but I still feel bad for those that did lose credit.

Unfortunately I have no idea why it happened, as nothing has changed at my end to account for this as far as I know. I also don't understand how, if my wingman has actual results to show for his work while I have "Stderr output" to show for mine, the two results could be compared and entered as valid. I would have thought that if I had really returned a zero result, it should have been entered as an invalid result and the work sent out again. One explanation might be that I did submit the same result but the validation process failed to give me credit and the lowest credit prevailed. This is not a comforting thought if it can continue to happen.
...
DSH

Validation is based on the uploaded result files. What's shown on the task detail pages is mostly information about the task sent in the request to the Scheduler. That only affects credit, not the science.

Whatever is going wrong, it must be server code. The client surely is sending the same information as before. There's an option in the project configuration:

<dont_store_success_stderr/>

If present, don't store the stderr log in the database for successful workunits. May be useful to save on database size. Available since r18528.

Trying that would account for not having a stderr section to show, but if it is bugged bad enough to also not save the runtime and CPU time I'd guess no project has tried it since it was added in June 2009.
                                                                Joe


Well, the count is now up to 30+, and for a plodder like me who doesn't submit a large number of results this is a little discouraging. What I don't understand is this: if it is server-side code that is creating the problem, why is it that when I look at the results it is always mine that shows no results and zero CPU time, resulting in zero credit?

Unless I am interpreting the data wrong, I have not found a single zero-credit result that was determined by my wingman's result. It would appear that even if it is server-side code that is creating the error, there is something in my results that is the trigger. At the end of August I finally bit the bullet and upgraded the client from 5.28 to 6.03, but I didn't see anything different until this reporting period. ???? DSH

You could try updating BOINC to the current recommended version, 6.10.58; BOINC 6.2.14 is rather old.
New Credit uses Run time for part of its calculations. If the BOINC version is old and doesn't report Run time, then CPU time is supposed to be used instead,
but if your BOINC version is reporting zero Run time, then Peak FLOP Count is going to be zero too, ending up with zero credit being granted.
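
The fallback described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the actual BOINC server code: the function names, the use of `None` for "run time not reported", and the idea that the estimate is simply elapsed seconds times the host's peak FLOPS are all assumptions made for the example.

```python
from typing import Optional


def elapsed_for_credit(run_time: Optional[float], cpu_time: float) -> float:
    """Pick the time value used for the estimate (hypothetical helper).

    None stands for "the client did not report a run time at all"
    (the old-client case); only then is CPU time used as the fallback.
    A reported 0.0 is taken at face value.
    """
    if run_time is None:
        return cpu_time
    return run_time


def peak_flop_count(run_time: Optional[float], cpu_time: float,
                    peak_flops: float) -> float:
    """Assumed estimate: elapsed seconds times the device's peak FLOPS."""
    return elapsed_for_credit(run_time, cpu_time) * peak_flops


# Illustrative numbers only: 3600 s of work on a 2 GFLOPS host.
print(peak_flop_count(None, 3600.0, 2e9))  # run time missing: CPU-time fallback
print(peak_flop_count(0.0, 3600.0, 2e9))   # run time reported as zero
```

Under these assumptions a missing run time still yields a usable estimate, while a run time that is reported as zero never triggers the fallback, so the estimate and any credit derived from it come out as zero.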

Claggy

ID: 1038309
Profile rebest Project Donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 45,357,093
RAC: 0
United States
Message 1038362 - Posted: 2 Oct 2010, 17:50:51 UTC

Is the plan still to give us one full week without the three-day shutdown? My pendings are through the roof (10 days RAC). It would be nice to let things settle down before starting the weekly data crunch back up.

Join the PACK!
ID: 1038362
Profile [B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 1038369 - Posted: 2 Oct 2010, 19:08:43 UTC

I have had problems getting on the site. I was told it was something to do with port 80, either a firewall or a security problem at their end. Everything seems to be slow. I was told my firewall was OK. Any ideas as to what is causing the problems?
ID: 1038369
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1038381 - Posted: 2 Oct 2010, 19:35:54 UTC

And the forums are just trashed....
This is painful.

Don't know if this is the master database dragging everything down with it or some kinda network issue.

The computers all seem to be connecting OK though, so all is not lost.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1038381
Profile soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1038382 - Posted: 2 Oct 2010, 19:38:18 UTC - in response to Message 1038381.  

As long as it is just the board being laggy, it is all good. Come on, stay up, servers!!

(too bad there is not a thumbs up/down for posts like this)
Janice
ID: 1038382
Profile Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1038384 - Posted: 2 Oct 2010, 19:47:25 UTC - in response to Message 1038383.  

And the forums are just trashed....
This is painful.

Don't know if this is the master database dragging everything down with it or some kinda network issue.

The computers all seem to be connecting OK though, so all is not lost.


Well, the Data Distribution State on the Server status page hasn't updated for 5 hours. The page itself is up to date, but not the data within it.

Me think it's the master database that is slowly diggin itself down.

Something gave - 3 hours ago I could not even connect to the homepage, now I can get to everything, and apparently at normal speed.

Donald
Infernal Optimist / Submariner, retired
ID: 1038384
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1038419 - Posted: 2 Oct 2010, 21:34:10 UTC

My E7600 can connect well.

But my 940 BE can UL, but not report/request:
Since 17:12 UTC: Scheduler request failed: HTTP internal server error
Rebooting didn't help.

Is anyone else having problems too?

ID: 1038419
Jamie
Volunteer tester

Joined: 5 Apr 06
Posts: 162
Credit: 9,867,955
RAC: 0
United Kingdom
Message 1038421 - Posted: 2 Oct 2010, 21:38:28 UTC - in response to Message 1038419.  

Yep, I just had to replace a dead HDD, so I was hoping to get some WUs to replace the lost ones,
and now I can't connect (though I managed to attach again) to download any more.
ID: 1038421
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1038422 - Posted: 2 Oct 2010, 21:38:36 UTC - in response to Message 1038419.  
Last modified: 2 Oct 2010, 21:40:23 UTC

My E7600 can connect well.

But my 940 BE can UL, but not report/request:
Since 17:12 UTC: Scheduler request failed: HTTP internal server error
Rebooting didn't help.

Is anyone else having problems too?

My 2 PCs report and request normally; the only issue is getting work. I'd guess that out of 10 requests only 1 gives work, and then only a few tasks. I guess the servers are busy, but then again, the crickets don't show them being busy.
ID: 1038422
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1038423 - Posted: 2 Oct 2010, 21:44:45 UTC
Last modified: 2 Oct 2010, 21:49:57 UTC

Thanks to you both.

But I guess I have another problem, because BOINC on my 940 BE gets [EDIT: (for ~4 1/2 hours now) only] the error message: Scheduler request failed: HTTP internal server error

Did/do you get the same message?

The strange thing is that only one of my two PCs gets this message.
ID: 1038423
Jamie
Volunteer tester

Joined: 5 Apr 06
Posts: 162
Credit: 9,867,955
RAC: 0
United Kingdom
Message 1038424 - Posted: 2 Oct 2010, 21:45:48 UTC - in response to Message 1038423.  

I get the same message
: 02/10/2010 22:45:01 | SETI@home | Scheduler request failed: HTTP internal server error


ID: 1038424
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1038425 - Posted: 2 Oct 2010, 21:47:37 UTC - in response to Message 1038423.  

Thanks to you both.

But I guess I have another problem, because BOINC on my 940 BE gets the error message: Scheduler request failed: HTTP internal server error

Did/do you get the same message?

The strange thing is that only one of my two PCs gets this message.

No errors here, all have gone well and quick reporting/requesting work.

Except just now, think an AP spike is beginning.
ID: 1038425
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1038430 - Posted: 2 Oct 2010, 21:54:43 UTC
Last modified: 2 Oct 2010, 21:57:21 UTC

Thanks again..

BOINC has now got a new error message after ~4 1/2 hours:
Project communication failed: attempting access to reference site
Scheduler request failed: Couldn't connect to server
Internet access OK - project servers may be temporarily down.


And then again the old:
Scheduler request failed: HTTP internal server error

Ohh.. a pity.

I guess, it's not at my end.


EDIT: But the strange thing is that only one of my two PCs has problems.
ID: 1038430
Robert Ribbeck
Joined: 7 Jun 02
Posts: 644
Credit: 5,283,174
RAC: 0
United States
Message 1038431 - Posted: 2 Oct 2010, 21:54:44 UTC - in response to Message 1038421.  

Sorry for your lost hd

May I suggest giving Steve Gibson's SpinRite a try on that drive

esp if it was running WINBLOWs

fixes drive problems Like nothing else can

Would have sent this via a pm but the ghost is blocking me
ID: 1038431
Profile Questor Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 3 Sep 04
Posts: 471
Credit: 230,506,401
RAC: 157
United Kingdom
Message 1038558 - Posted: 3 Oct 2010, 3:20:17 UTC

Is anyone else having problems reporting?

Most of my machines are reporting one or two tasks at a time OK.

However, I have one which hasn't reported for a while and now has over 100 tasks to report.

When it reports I get

03/10/2010 04:14:38 SETI@home update requested by user
03/10/2010 04:14:41 SETI@home Sending scheduler request: Requested by user.
03/10/2010 04:14:41 SETI@home Reporting 109 completed tasks, requesting new tasks for CPU and GPU
03/10/2010 04:16:13 SETI@home Scheduler request failed: HTTP internal server error
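
For anyone wanting to pull the same facts out of a longer message log, here is a rough Python sketch that scans lines in the format pasted above and lists each failed scheduler request, how many tasks were waiting to be reported, and how long the request ran before failing. The function name, the dictionary keys and the regular expressions are made up for this example, and it assumes the dd/mm/yyyy timestamps shown in the post; other BOINC versions or locales may format the log differently.

```python
import re
from datetime import datetime

# Timestamp, project name, message; separators may be spaces or " | ".
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})[\s|]+(\S+)[\s|]+(.*)$"
)
REPORT_RE = re.compile(r"Reporting (\d+) completed tasks")

SAMPLE_LOG = """\
03/10/2010 04:14:38 SETI@home update requested by user
03/10/2010 04:14:41 SETI@home Sending scheduler request: Requested by user.
03/10/2010 04:14:41 SETI@home Reporting 109 completed tasks, requesting new tasks for CPU and GPU
03/10/2010 04:16:13 SETI@home Scheduler request failed: HTTP internal server error
"""


def failed_requests(log_text):
    """Return one dict per failed scheduler request found in log_text."""
    failures = []
    sent_at = None   # time of the most recent "Sending scheduler request"
    pending = None   # task count from the most recent "Reporting N ..." line
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that is not a timestamped log line
        when = datetime.strptime(match.group(1), "%d/%m/%Y %H:%M:%S")
        message = match.group(3)
        if message.startswith("Sending scheduler request"):
            sent_at = when
            pending = None
        reported = REPORT_RE.search(message)
        if reported:
            pending = int(reported.group(1))
        if message.startswith("Scheduler request failed"):
            waited = (when - sent_at).total_seconds() if sent_at else None
            failures.append({
                "time": when,
                "error": message.split(": ", 1)[-1],
                "tasks_waiting": pending,
                "seconds_waited": waited,
            })
    return failures


for failure in failed_requests(SAMPLE_LOG):
    print(failure)
```

Run against the four lines above, it picks up the single failure together with the 109 tasks that were waiting and the gap between sending the request and the error coming back.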


Have rebooted machine.
GPU Users Group



ID: 1038558
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1038560 - Posted: 3 Oct 2010, 3:31:30 UTC - in response to Message 1038558.  
Last modified: 3 Oct 2010, 3:36:21 UTC

E7600:
Reporting 51 completed tasks, requesting new tasks for CPU and GPU
Scheduler request failed: HTTP internal server error


Last scheduler contact: 22:41 UTC


940 BE:
Reporting 360 completed tasks, requesting new tasks for GPU
Scheduler request failed: HTTP internal server error


Last scheduler contact: 17:12 UTC


EDIT: BTW, AFAIK 'HTTP internal server error' means a problem with the S@h scheduler.
ID: 1038560
Profile Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 1038562 - Posted: 3 Oct 2010, 3:54:19 UTC

Hmm..............

Last contacted SETI on 10 Sep 2010 22:17:04 UTC.

I would perform a project reset. More important to get the machine back online with SETI since the completed work is probably lost anyway.

Be advised that you will probably lose the optimized apps installation also so be prepared to reinstall the optimized apps again.

This is what I would do but it's your machine...............

Boinc....Boinc....Boinc....Boinc....
ID: 1038562


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.