Internal server error...

Message boards : Number crunching : Internal server error...

Rabbit&Carrot

Joined: 3 Oct 03
Posts: 25
Credit: 80,178,117
RAC: 0
Korea, South
Message 1233682 - Posted: 19 May 2012, 23:17:48 UTC - in response to Message 1233357.  
Last modified: 19 May 2012, 23:20:00 UTC

Since recovery all boxes have uploaded OK, except one. This box is trying to report 2500+ WUs and I am getting "Scheduler request failed: internal server error" each time it tries to report them...

cheers

Try upgrading to BOINC 6.12.x, then use the following cc_config.xml:

<cc_config>
   <log_flags>
   </log_flags>
   <options>
      <max_tasks_reported>200</max_tasks_reported>
   </options>
</cc_config>


I managed to report ~700 WUs in one go yesterday, while Mark needed to use this to report his 1800 to 2400 WUs.

Afterwards, uninstall 6.12.x and reinstall 6.10.58.

Claggy


I tried this (upgrading BOINC Manager from 6.10.60 to 6.12.34 and creating the cc_config.xml file), but it didn't work for me. Is there any other way for my rig to connect to the server and report 800+ WUs?
ID: 1233682
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1233695 - Posted: 19 May 2012, 23:32:38 UTC - in response to Message 1233682.  

I tried this (upgrading BOINC Manager from 6.10.60 to 6.12.34 and creating the cc_config.xml file), but it didn't work for me. Is there any other way for my rig to connect to the server and report 800+ WUs?


The cc_config.xml file goes in the root BOINC data folder, not the seti folder.

That is, the same folder that also contains client_state.xml and stdoutdae.txt.
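
(For reference, and only as a rough guide since the exact path depends on the OS and the choices made at install time, the BOINC data folder is usually one of:

   C:\ProgramData\BOINC\                                          (Windows Vista/7)
   C:\Documents and Settings\All Users\Application Data\BOINC\    (Windows XP)
   /var/lib/boinc-client/                                         (Linux, package install)
   /Library/Application Support/BOINC Data/                       (Mac OS X)

Put cc_config.xml in there, then restart BOINC, or use Advanced -> Read config file in the Manager if your version has it, so the client picks it up.)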


ID: 1233695
Rabbit&Carrot

Joined: 3 Oct 03
Posts: 25
Credit: 80,178,117
RAC: 0
Korea, South
Message 1233709 - Posted: 20 May 2012, 0:05:21 UTC - in response to Message 1233695.  

The cc_config.xml file goes in the root BOINC data folder, not the seti folder.

That is, the same folder that also contains client_state.xml and stdoutdae.txt.


Thank you very much. It now works!
ID: 1233709
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1233716 - Posted: 20 May 2012, 0:12:42 UTC

Glad we could help.

ID: 1233716
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1233889 - Posted: 20 May 2012, 5:57:48 UTC

Yesterday I added the mod to all of my rigs, whether they needed it right now or not. It won't come into play unless a server outage prevents reporting and the work to send back to Seti builds up.
Otherwise it will just stay transparent.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1233889
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1234434 - Posted: 21 May 2012, 9:08:09 UTC

All
I have had a look at the comments above since putting this up, and the following are just my responses to some of those expressed.

Claggy, your comment “And the whole procedure can probably be done in 15 to 20 minutes, and 5 minutes after that they could be refilling their cache again” is not quite right. From the perspective of a single one-time event it would be correct, but the issue here is per machine, per outage: the probability of any machine getting this message is a function of its processing capability and the duration of the outage.

Dave went down the path you suggested, but it hasn’t really solved the problem. On the next (unscheduled) outage, in all likelihood he is going to hit the same problem.

To that end I agree with tbret who stated “Someone better tell someone to do whatever is necessary to fix this reporting bug, though. The longer and more total the outage, and the bigger the cruncher, the more likely more work will be crunched and uploaded, but unreported.”

The comment is correct, and when you consider the new generation of GPUs and how easy it is for anyone to put a second or third GPU in a box, you can see that it won’t take long for the general mass of users to start experiencing this problem.

As for me, I am basically in the set-and-forget camp. It may not look like it when you look at the boxes that I run, but that is my basic profile. It would be true to say that I am time-poor, and this is where I think Richard is wrong. There are many out there who are set-and-forget, and as they roll over machines and GPUs, this is going to become more prevalent and problematic with each outage, scheduled or not.

The solution is to fix the server-side software to remove the problem, such that any iteration of seti@home can work properly with the servers at Berkeley regardless of the number of work units downloaded.

In the meantime, if someone could reboot the appropriate servers, or do whatever is needed at an administrator level, to help clear this issue for this machine of mine, it would be greatly appreciated.

cheers

ID: 1234434
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1234444 - Posted: 21 May 2012, 10:39:28 UTC - in response to Message 1234434.  

The solution is to fix the server-side software to remove the problem, such that any iteration of seti@home can work properly with the servers at Berkeley regardless of the number of work units downloaded.

In the meantime, if someone could reboot the appropriate servers, or do whatever is needed at an administrator level, to help clear this issue for this machine of mine, it would be greatly appreciated.

The fix is to move the scheduler to the Campus link so scheduler contacts aren't competing with uploads and downloads on the Hurricane link; there are politics involved there, so we'll probably be waiting a while.
Or increase the timeout from 5 minutes to 10 minutes, but I'm sure there will be consequences to that change.
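
Purely as an illustration, and assuming (I don't know the actual Berkeley setup) that the 5-minute limit is just an httpd-level Timeout on the scheduler's virtual host, that second option would amount to a one-line change:

   # hypothetical scheduler vhost setting, not the actual Berkeley configuration
   Timeout 600    # was 300: allow a scheduler request up to 10 minutes instead of 5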

Claggy
ID: 1234444
Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1234447 - Posted: 21 May 2012, 10:46:13 UTC - in response to Message 1234444.  
Last modified: 21 May 2012, 10:48:18 UTC

The fix is to move the scheduler to the Campus link so scheduler contacts aren't competing with uploads and downloads on the Hurricane link; there are politics involved there, so we'll probably be waiting a while.
Or increase the timeout from 5 minutes to 10 minutes, but I'm sure there will be consequences to that change.

Claggy

I don't think this issue has anything to do with the link, but with the database server(s) not being able to process that many workunits before the HTTP request times out.
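
To illustrate why capping the batch size helps (a toy sketch only, nothing like BOINC's actual client code; send_scheduler_rpc is a made-up stand-in for one scheduler contact):

# Toy sketch, not BOINC source: report results in capped batches so that
# no single scheduler request has to outlive the server-side timeout.
MAX_TASKS_REPORTED = 200  # mirrors <max_tasks_reported> in cc_config.xml

def report_completed(results, send_scheduler_rpc):
    """Send completed results in several small requests instead of one huge one."""
    for i in range(0, len(results), MAX_TASKS_REPORTED):
        send_scheduler_rpc(results[i:i + MAX_TASKS_REPORTED])

With 2500 finished tasks that is thirteen small requests instead of one request the server can't finish in time.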

I agree with Lionel: this IS an issue with the servers and should be addressed somehow. Not everyone will check their clients and make an adjustment to cc_config.xml, so a lot of work is going to go to waste, needlessly clog the database and link with resends, and make a lot of wingmen wait.
ID: 1234447
Eric Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1382
Credit: 54,506,847
RAC: 60
United States
Message 1234568 - Posted: 21 May 2012, 15:44:23 UTC

I've got some ideas where the problem might be. I'll start digging.
@SETIEric@qoto.org (Mastodon)

ID: 1234568
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1234583 - Posted: 21 May 2012, 16:25:42 UTC - in response to Message 1234568.  

I've got some ideas where the problem might be. I'll start digging.

Thanks, Eric.

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1234583
Dave

Joined: 29 Mar 02
Posts: 778
Credit: 25,001,396
RAC: 0
United Kingdom
Message 1234627 - Posted: 21 May 2012, 18:22:39 UTC

Thanks Eric.
ID: 1234627
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1234642 - Posted: 21 May 2012, 18:58:30 UTC - in response to Message 1234568.  

I've got some ideas where the problem might be. I'll start digging.

Thanks Eric,

Claggy
ID: 1234642
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1234751 - Posted: 21 May 2012, 22:05:52 UTC - in response to Message 1234568.  

I've got some ideas where the problem might be. I'll start digging.


Thank you, Eric.


ID: 1234751
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1234752 - Posted: 21 May 2012, 22:11:28 UTC - in response to Message 1234568.  

I've got some ideas where the problem might be. I'll start digging.


Thanks for looking into it, Eric. Methinks it warrants your time. Nice to know you're on the job.

I don't know how you face this stuff e v e r y s i n g l e d a y.

You're a good guy. ...but I hope you have the good sense not to tell your wife anyone said that.

It's sooooooo deflating when they laugh that hard.

ID: 1234752
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1235628 - Posted: 24 May 2012, 0:04:45 UTC - in response to Message 1234752.  

I can see that this problem now appears to be manifesting itself in other issues (due to a recent unannounced server-side change: http://setiathome.berkeley.edu/forum_thread.php?id=68164).

Anyway, my problem was still there this morning and I couldn't report 3000 WUs or get any new WUs, so I blew the lot away ... now chomping down on WUs to fill the cache ... should take a few days to fill up ... so that spike in download bandwidth is probably me ...

Good luck fixing the issue ...

cheers
ID: 1235628
Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1235631 - Posted: 24 May 2012, 0:23:50 UTC - in response to Message 1235628.  
Last modified: 24 May 2012, 0:27:05 UTC

Way to ruin a *lot* of already crunched work... :)
Temp. upgrading was too hard, eh?
ID: 1235631
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1235650 - Posted: 24 May 2012, 1:48:13 UTC - in response to Message 1235631.  

Way to ruin a *lot* of already crunched work... :)
Temp. upgrading was too hard, eh?


Upgrading and downgrading a client isn't the way to fix a server-side issue ... the issue is going to become more prevalent unless it is addressed, and that is what they actually need to do ... which is why Eric is looking into it ... as to the WUs, easy come easy go ...

cheers and happy crunching
ID: 1235650