Panic Mode On (59) Server problems?

LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1161518 - Posted: 12 Oct 2011, 15:18:57 UTC - in response to Message 1161485.  

Another 'bug'?

No, that's a feature, I think (works as designed:-). ;-)

Gruß,
Gundolf


The usual problem with quick fixes: they get implemented in a hurry to solve an urgent problem, but don't get completely thought through.

In this case the urgent problem was that old (i.e. stock and x32f) CUDA code couldn't handle VLAR tasks very well - taking ages to crunch and in many cases making the display so sluggish it was impossible to work on the system.

The quick fix was to check the AR of the tasks, mark any with an AR below [whatever, I keep forgetting - 0.2?] as _vlar and tell the scheduler not to send those to NVidia GPUs.

What nobody thought of was that on mixed requests, the scheduler tries to satisfy first one part (normally GPU) and then the other. Only after the first part has been satisfied does it look to see whether it has anything for the second part. And it doesn't check a second time whether any of the tasks that didn't match the first part are OK for the second part.

You could call that a design flaw.
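
To make the flaw concrete, here is a toy model in Python. It is not BOINC scheduler code: the `Task` class, both function names, and the 0.2 cutoff (the post above is itself unsure of the real value) are assumptions made purely for illustration.

```python
from dataclasses import dataclass

VLAR_CUTOFF = 0.2  # assumed value; the post is not sure of the real threshold


@dataclass
class Task:
    name: str
    angle_range: float

    @property
    def is_vlar(self) -> bool:
        return self.angle_range < VLAR_CUTOFF


def flawed_dispatch(tasks, gpu_wanted, cpu_wanted):
    """GPU part first; tasks skipped there are never offered to the CPU part."""
    gpu, cpu = [], []
    pos = 0
    # First part: fill the GPU request, skipping VLAR tasks.
    while pos < len(tasks) and len(gpu) < gpu_wanted:
        if not tasks[pos].is_vlar:
            gpu.append(tasks[pos])
        pos += 1
    # Second part: carry on from where the first part stopped,
    # without looking back at the tasks that were skipped.
    while pos < len(tasks) and len(cpu) < cpu_wanted:
        cpu.append(tasks[pos])
        pos += 1
    return gpu, cpu


def rechecking_dispatch(tasks, gpu_wanted, cpu_wanted):
    """Same order, but anything the GPU part did not take is offered to the CPU."""
    gpu, leftovers = [], []
    for task in tasks:
        if len(gpu) < gpu_wanted and not task.is_vlar:
            gpu.append(task)
        else:
            leftovers.append(task)
    return gpu, leftovers[:cpu_wanted]


# Hypothetical queue: "a" and "c" are VLAR, "b" and "d" are not.
queue = [Task("a", 0.01), Task("b", 0.42), Task("c", 0.008), Task("d", 1.1)]
flawed_gpu, flawed_cpu = flawed_dispatch(queue, gpu_wanted=2, cpu_wanted=2)
fixed_gpu, fixed_cpu = rechecking_dispatch(queue, gpu_wanted=2, cpu_wanted=2)
```

In this made-up queue the flawed version hands the CPU nothing, even though two VLAR tasks were sitting there, while the rechecking version gives both of them to the CPU.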

Current optimised CUDA apps (intended to become stock on V7) can handle VLAR nicely (at least on the handful of Alpha tester systems). So, once testing on Beta has started, we'll see about getting that particular distinction disabled, making sure the VLAR tasks no longer pose a problem, and carrying that through to main when it gets rolled out.

Somebody was asking about a V7 Lunatics installer. We are missing a few vital pieces of information (not to mention an app or two), but we'll try to get one out as fast as possible after the rollout.
X38g can handle V7 tasks, but the app_info.xml entries need to be adapted and we don't have a template yet.
ID: 1161518 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1161530 - Posted: 12 Oct 2011, 16:07:22 UTC

Every time my main cruncher has asked for work in the past 12 hours, it has got a single AP in response.

2011-10-11 23:42:47|SETI@home|Sending scheduler request: To fetch work. Requesting 125237 seconds of work, reporting 2 completed tasks
2011-10-11 23:42:52|SETI@home|Scheduler request succeeded: got 1 new tasks
2011-10-11 23:42:54|SETI@home|Started download of ap_25se11ah_B6_P0_00300_20111011_13645.wu
2011-10-11 23:47:01|SETI@home|Finished download of ap_25se11ah_B6_P0_00300_20111011_13645.wu
2011-10-11 23:47:59|SETI@home|Sending scheduler request: To fetch work. Requesting 50249 seconds of work, reporting 0 completed tasks
2011-10-11 23:48:04|SETI@home|Scheduler request succeeded: got 1 new tasks
2011-10-11 23:48:06|SETI@home|Started download of ap_26se11ab_B4_P1_00099_20111011_27845.wu
2011-10-11 23:52:30|SETI@home|Finished download of ap_26se11ab_B4_P1_00099_20111011_27845.wu
2011-10-11 23:53:11|SETI@home|Sending scheduler request: To fetch work. Requesting 3361 seconds of work, reporting 0 completed tasks
2011-10-11 23:53:16|SETI@home|Scheduler request succeeded: got 1 new tasks
2011-10-11 23:53:18|SETI@home|Started download of ap_25se11ah_B6_P0_00407_20111011_27026.wu
2011-10-11 23:57:54|SETI@home|Finished download of ap_25se11ah_B6_P0_00407_20111011_27026.wu
2011-10-12 03:44:32|SETI@home|Sending scheduler request: To fetch work. Requesting 85 seconds of work, reporting 0 completed tasks
2011-10-12 03:44:37|SETI@home|Scheduler request succeeded: got 1 new tasks
2011-10-12 03:44:39|SETI@home|Started download of ap_26se11ab_B5_P1_00402_20111011_19585.wu
2011-10-12 03:49:04|SETI@home|Finished download of ap_26se11ab_B5_P1_00402_20111011_19585.wu
2011-10-12 06:37:53|SETI@home|Computation for task ap_24ap11ah_B4_P1_00387_20111007_14448.wu_0 finished
2011-10-12 06:37:53|SETI@home|Starting ap_19fe11ag_B1_P1_00111_20111007_08812.wu_0
2011-10-12 06:37:53|SETI@home|Starting task ap_19fe11ag_B1_P1_00111_20111007_08812.wu_0 using astropulse_v505 version 505
2011-10-12 06:37:53|SETI@home|Sending scheduler request: To fetch work. Requesting 12893 seconds of work, reporting 0 completed tasks
2011-10-12 06:37:55|SETI@home|Started upload of ap_24ap11ah_B4_P1_00387_20111007_14448.wu_0_0
2011-10-12 06:37:57|SETI@home|Finished upload of ap_24ap11ah_B4_P1_00387_20111007_14448.wu_0_0
2011-10-12 06:38:04|SETI@home|Scheduler request succeeded: got 1 new tasks
2011-10-12 06:38:06|SETI@home|Started download of ap_26se11ad_B4_P1_00375_20111011_04259.wu
2011-10-12 06:43:06|SETI@home|Finished download of ap_26se11ad_B4_P1_00375_20111011_04259.wu
2011-10-12 10:20:26|SETI@home|Sending scheduler request: To fetch work. Requesting 225 seconds of work, reporting 1 completed tasks
2011-10-12 10:20:31|SETI@home|Scheduler request succeeded: got 1 new tasks
2011-10-12 10:20:33|SETI@home|Started download of ap_26se11ac_B5_P0_00332_20111012_10694.wu
2011-10-12 10:24:28|SETI@home|Finished download of ap_26se11ac_B5_P0_00332_20111012_10694.wu


Maybe I'm lucky.
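
For anyone who wants to tally this from their own log, here is a minimal Python sketch. It assumes the `timestamp|project|message` layout and the exact message wording shown in the log above; `summarize` and the two regular expressions are my own names, not part of BOINC.

```python
import re

REQUEST_RE = re.compile(r"Requesting (\d+) seconds of work")
REPLY_RE = re.compile(r"got (\d+) new tasks")


def summarize(log_text):
    """Pair each work request with the number of tasks the scheduler returned."""
    pairs = []
    pending = None
    for line in log_text.splitlines():
        parts = line.split("|", 2)
        if len(parts) != 3:
            continue  # not a "timestamp|project|message" line
        message = parts[2]
        request = REQUEST_RE.search(message)
        reply = REPLY_RE.search(message)
        if request:
            pending = int(request.group(1))
        elif reply and pending is not None:
            pairs.append((pending, int(reply.group(1))))
            pending = None
    return pairs


# First two request/reply pairs copied from the log above.
sample = """\
2011-10-11 23:42:47|SETI@home|Sending scheduler request: To fetch work. Requesting 125237 seconds of work, reporting 2 completed tasks
2011-10-11 23:42:52|SETI@home|Scheduler request succeeded: got 1 new tasks
2011-10-11 23:47:59|SETI@home|Sending scheduler request: To fetch work. Requesting 50249 seconds of work, reporting 0 completed tasks
2011-10-11 23:48:04|SETI@home|Scheduler request succeeded: got 1 new tasks
"""
print(summarize(sample))
```

Each tuple is (seconds requested, tasks received), so a run of `(something, 1)` entries is the one-AP-per-request pattern described above.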
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1161530 · Report as offensive
Paris
Joined: 20 May 99
Posts: 110
Credit: 1,012,250
RAC: 0
United States
Message 1161541 - Posted: 12 Oct 2011, 16:45:51 UTC - in response to Message 1161499.  

No uploads, no downloads, no tasks able to report since Oct. 5. Usually I can get through for a while after the weekly outage but not this week. I'm hanging on here but I'm out of work. I hope I can get through before the upcoming deadlines (Oct.18) get me.

I read about using proxy servers but I think that it is a bit beyond my capabilities. I'm using Mac OS X in several flavors so if anyone could point me to a step-by-step resource, I might try it. Thanks.

It's not hard to use...
Try Richard's post from the last Panic thread.


Wow! That was so easy. I had visions of terminal/command line arcane entries and changing all sorts of network settings. I'm not sure how I missed the posts you directed me to but I can't thank you enough for the help. Things are humming again.

Plus SETI Classic = 21,082 WUs
ID: 1161541 · Report as offensive
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1161588 - Posted: 12 Oct 2011, 18:24:11 UTC - in response to Message 1161541.  
Last modified: 12 Oct 2011, 19:20:57 UTC

Do you guys get AP WUs? I haven't received a single unit in the last 48 hours:(

[EDIT] I just have to rant online - and ... tataaa ... I get units :)
ID: 1161588 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51478
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1161589 - Posted: 12 Oct 2011, 18:26:19 UTC - in response to Message 1161541.  

No uploads, no downloads, no tasks able to report since Oct. 5. Usually I can get through for a while after the weekly outage but not this week. I'm hanging on here but I'm out of work. I hope I can get through before the upcoming deadlines (Oct.18) get me.

I read about using proxy servers but I think that it is a bit beyond my capabilities. I'm using Mac OS X in several flavors so if anyone could point me to a step-by-step resource, I might try it. Thanks.

It's not hard to use...
Try Richard's post from the last Panic thread.


Wow! That was so easy. I had visions of terminal/command line arcane entries and changing all sorts of network settings. I'm not sure how I missed the posts you directed me to but I can't thank you enough for the help. Things are humming again.

You are most welcome!
So glad it solved your problem for now.
Be prepared to remove the proxy if the router eventually gets sorted, or try a different proxy in the interim if the one you are using becomes nonresponsive.

Meow!
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1161589 · Report as offensive
JohnDK
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1161603 - Posted: 12 Oct 2011, 19:02:58 UTC - in response to Message 1161588.  

Do you guys get AP WUs? I haven't received a single unit in the last 48 hours:(

Got some both today and yesterday.
ID: 1161603 · Report as offensive
MikeN

Joined: 24 Jan 11
Posts: 319
Credit: 64,719,409
RAC: 85
United Kingdom
Message 1161654 - Posted: 12 Oct 2011, 21:56:59 UTC - in response to Message 1161588.  

It seems APs are all I can get at present. I currently have 65 of them (46 in progress, 18 pending, 1 valid) spread over three machines. Usually I have fewer than 10 APs on CPUs set to crunch either MB or AP. On one of my machines they are causing chaos, as they come in with predicted run times of 196 hours when they actually take ca. 18 hours to crunch. I keep suspending (and then unsuspending 5 minutes later) all the MB work to force SETI to crunch the APs, thus reducing the predicted time remaining and allowing it to download more WUs, and lo and behold all I get is another AP with a 200 hour predicted run time!

Quite a few of the pending APs have actually been completed by me + wingman and are waiting for the validator to calculate the credit.
ID: 1161654 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22535
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1161665 - Posted: 12 Oct 2011, 22:10:17 UTC

Let the APs run and report. Gradually you will notice the estimated time getting closer to the real time as each one is reported.
It takes a couple of dozen APs to get the estimate nearer the truth, so just be patient and the BOINC manager will sort it out.
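
As a rough picture of why it takes that long, here is a small Python sketch. It is not BOINC's actual correction rule: it simply assumes each reported task closes 10% of the gap between the estimate and the real run time (the `step` value is my assumption). The 196 h and 18 h figures are the ones MikeN quoted.

```python
def estimate_history(first_estimate_h, actual_h, tasks, step=0.1):
    """Estimate shown before each task, closing `step` of the gap per report."""
    estimates = []
    estimate = first_estimate_h
    for _ in range(tasks):
        estimates.append(estimate)
        estimate += step * (actual_h - estimate)
    return estimates


# history[n] is the estimate after n tasks have been reported.
history = estimate_history(196.0, 18.0, tasks=25)
for n in (0, 6, 12, 24):
    print(f"after {n:2d} reported: {history[n]:6.1f} h estimated")
```

Under that assumed step the estimate drops quickly at first and then creeps towards the real figure, which is why the first few reports make the biggest visible difference.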

Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1161665 · Report as offensive
Dimly Lit Lightbulb 😀
Volunteer tester
Joined: 30 Aug 08
Posts: 15399
Credit: 7,423,413
RAC: 1
United Kingdom
Message 1161684 - Posted: 12 Oct 2011, 23:06:37 UTC

@nmn17 Don't worry about the long completion times; the tasks will take as long as previous ones did. The long estimates are due to a few server-side fixes for some problems, and they'll settle down (i.e. get back to the times the tasks actually take to crunch) in no time. Just be patient and before you know it the times will be as they were before.
ID: 1161684 · Report as offensive
MikeN

Joined: 24 Jan 11
Posts: 319
Credit: 64,719,409
RAC: 85
United Kingdom
Message 1161808 - Posted: 13 Oct 2011, 7:18:45 UTC - in response to Message 1161684.  

@nmn17 Don't worry about the long completion times; the tasks will take as long as previous ones did. The long estimates are due to a few server-side fixes for some problems, and they'll settle down (i.e. get back to the times the tasks actually take to crunch) in no time. Just be patient and before you know it the times will be as they were before.


I know, but in the meantime I have few other tasks on this PC. I worry that with my luck, when the APs do finish after about 20 h, I will hit a server outage and not have any other work to get me through it. By forcing SETI to crunch the APs first, the estimated completion times come down sufficiently to allow more WUs to be downloaded before they finish, and I can keep a bunch of about 20 VLARs with completion dates much later than the APs as a 'reserve' in case I need them. Left to its own devices, SETI would crunch on a first-in, first-out basis and leave the APs until last, with their predicted 200 h run times unchanged and preventing the download of any other WUs.
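
The arithmetic behind that can be sketched in Python. This is a deliberately simplified picture, not the real BOINC work-fetch logic: it assumes the client asks for its target buffer minus the estimated queued work per core. The buffer size, core count and queue length are hypothetical; only the 196 h and 18 h figures come from this thread.

```python
def work_request_seconds(buffer_hours, estimated_hours_per_task, cores):
    """Seconds of work to request: target buffer minus estimated queued work per core."""
    queued_per_core = sum(estimated_hours_per_task) / cores
    shortfall_hours = max(0.0, buffer_hours - queued_per_core)
    return round(shortfall_hours * 3600)


BUFFER_HOURS = 96  # hypothetical cache setting, not from the thread
CORES = 4          # hypothetical core count
QUEUED_APS = 4     # hypothetical number of APs waiting on this host

# Same queue, once with the inflated estimate and once with the real run time.
inflated = work_request_seconds(BUFFER_HOURS, [196.0] * QUEUED_APS, CORES)
realistic = work_request_seconds(BUFFER_HOURS, [18.0] * QUEUED_APS, CORES)
print(inflated, realistic)
```

With the inflated estimates the queue already looks fuller than the buffer, so nothing is requested; with realistic estimates the same queue leaves plenty of room for more work.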
ID: 1161808 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22535
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1161811 - Posted: 13 Oct 2011, 7:33:05 UTC

Just let BOINC do its thing. It will sort itself out faster left to its own devices, and remember the last couple of Tuesday outages have only lasted a few hours, not the days of the unplanned events we've suffered over recent weekends.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1161811 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 66355
Credit: 55,293,173
RAC: 49
United States
Message 1161920 - Posted: 13 Oct 2011, 17:11:28 UTC - in response to Message 1161684.  

@nmn17 Don't worry about the long completion times; the tasks will take as long as previous ones did. The long estimates are due to a few server-side fixes for some problems, and they'll settle down (i.e. get back to the times the tasks actually take to crunch) in no time. Just be patient and before you know it the times will be as they were before.

That explains what I'm seeing in SETI WUs at least. I thought it was something that I did. :D
Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST

ID: 1161920 · Report as offensive
Richard Haselgrove
Volunteer tester

Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1161944 - Posted: 13 Oct 2011, 18:51:20 UTC - in response to Message 1161943.  

OK, it's Thursday evening here, and less than 2 hours before the expected return of the by now infamous yellow creature into the bandwidth pond. Whoever has the creature this time, please reconsider your actions, and if possible do not set the yellow one free on us all, this close to the weekend.

LOL. Speaking as custodian of the yellow one of which you speak, I intend to be heading down the pub at about that time. If you ask very nicely, I may take him with me, instead of leaving him to his own devices ;-)
ID: 1161944 · Report as offensive
Richard Haselgrove
Volunteer tester

Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1161947 - Posted: 13 Oct 2011, 19:01:02 UTC - in response to Message 1161946.  

OK, it's Thursday evening here, and less than 2 hours before the expected return of the by now infamous yellow creature into the bandwidth pond. Whoever has the creature this time, please reconsider your actions, and if possible do not set the yellow one free on us all, this close to the weekend.

LOL. Speaking as custodian of the yellow one of which you speak, I intend to be heading down the pub at about that time. If you ask very nicely, I may take him with me, instead of leaving him to his own devices ;-)

Please Richard, pretty please, take the yellow one with you and let him/her take a bath in a keg of beer, and not in our precious bandwidth pond.

Was that nice enough?

Sure, I'll take her to the pub then.

Though I'm not sure if I ought to be held responsible for what she gets up to after a skinful of ale... (it's quiz night tonight, so it gets quite lively in there)
ID: 1161947 · Report as offensive
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1161949 - Posted: 13 Oct 2011, 19:03:47 UTC

Isn't that cruelty to animals, to deny her a free swim?
ID: 1161949 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1161950 - Posted: 13 Oct 2011, 19:09:32 UTC

Prepare the duck!

I'm still getting a single AP every time I ask for work. There have only been two replies of "you do not have the stuff for MB yadda yadda yadda." The rest of the requests give me work. I think things are finally starting to stabilize, aside from the AP validator.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1161950 · Report as offensive
AndyJ
Joined: 17 Aug 02
Posts: 248
Credit: 27,380,797
RAC: 0
United Kingdom
Message 1161998 - Posted: 13 Oct 2011, 22:01:07 UTC - in response to Message 1161541.  
Last modified: 13 Oct 2011, 22:15:05 UTC

No uploads, no downloads, no tasks able to report since Oct. 5. Usually I can get through for a while after the weekly outage but not this week. I'm hanging on here but I'm out of work. I hope I can get through before the upcoming deadlines (Oct.18) get me.

I read about using proxy servers but I think that it is a bit beyond my capabilities. I'm using Mac OS X in several flavors so if anyone could point me to a step-by-step resource, I might try it. Thanks.

It's not hard to use...
Try Richard's post from the last Panic thread.


Wow! That was so easy. I had visions of terminal/command line arcane entries and changing all sorts of network settings. I'm not sure how I missed the posts you directed me to but I can't thank you enough for the help. Things are humming again.


OK, been keeping it secret, and yes, I do know the issues with proxies, and if you don't, you do not want to use proxy 64.71.138.95:80 (http)

: )

Fast, and stable for a week or so. Circumvented all HE problems here.

BUT
Your mileage may vary.
Regards,
Andy
ID: 1161998 · Report as offensive
Wembley
Volunteer tester
Joined: 16 Sep 09
Posts: 429
Credit: 1,844,293
RAC: 0
United States
Message 1162114 - Posted: 14 Oct 2011, 6:53:22 UTC

ID: 1162114 · Report as offensive
Rolf

Joined: 16 Jun 09
Posts: 114
Credit: 7,817,146
RAC: 0
Switzerland
Message 1162264 - Posted: 14 Oct 2011, 18:45:34 UTC

What happened?
Uploads and downloads work like in the good old days!
No problems at all!
ID: 1162264 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51478
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1162274 - Posted: 14 Oct 2011, 19:19:23 UTC - in response to Message 1162267.  
Last modified: 14 Oct 2011, 19:19:57 UTC

What happened?
Uploads and downloads work like in the good old days!
No problems at all!


Now you jinxed the whole system. Prepare for a weekend without any new work units, or uploads, or reporting...

Crap. Thanks for nothing.

LOL

OH, CRAP.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1162274 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.