Panic Mode On (30) Server problems

Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 978296 - Posted: 13 Mar 2010, 19:00:30 UTC - in response to Message 978293.  
Last modified: 13 Mar 2010, 19:00:48 UTC

I never run out of work. I even still have a Seti and Seti Beta WU that are only partially done. Other than Seti, I have three projects currently not supplying work, plus three dead projects on my list. Boinc is working precisely as the developers intended.

Except we are on the Seti board not the Boinc board...Some of us choose to only run seti and it gives us a chance to whine and blow steam. It is really quite harmless but also a part of the Hobby....Eric would probably get bored if everything always worked and the boards would be dead. LOL! Back to B*tchin...
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 978296 · Report as offensive
nemesis
Joined: 12 Oct 99
Posts: 1408
Credit: 35,074,350
RAC: 0
Message 978299 - Posted: 13 Mar 2010, 19:07:01 UTC
Last modified: 13 Mar 2010, 19:08:26 UTC

i'm here for the free coffee.
and flavored half and half..

(Hiamps...your RAC is higher than mine.
Stop that!)
ID: 978299 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 978302 - Posted: 13 Mar 2010, 19:14:04 UTC

Getting problems with uploads/downloads/reporting now again.
ID: 978302 · Report as offensive
Profile AndyW Project Donor
Volunteer tester
Joined: 23 Oct 02
Posts: 5862
Credit: 10,957,677
RAC: 18
United Kingdom
Message 978314 - Posted: 13 Mar 2010, 19:48:50 UTC - in response to Message 978264.  



It's sad because apparently Boinc/Seti cannot function properly without constant onsite human intervention to keep it online.


Hmm, a bit harsh - and wrong, IMO. My BOINC is still working - it's still crunching SETI WUs, which is precisely what it is supposed to be doing. The SETI servers are flaky, yes - but that is PRECISELY why we have BOINC to look after the flow of work to & from our home PCs. That way SETI doesn't rely on state of the art servers, which matters on its VERY limited budget.

ID: 978314 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 978334 - Posted: 13 Mar 2010, 20:46:19 UTC

Signs of life....
3/13/2010 12:43:28 PM SETI@home Finished upload of 17no09ac.22446.14382.5.10.156_1_0

Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 978334 · Report as offensive
Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 978335 - Posted: 13 Mar 2010, 20:48:33 UTC - in response to Message 978299.  

i'm here for the free coffee.
and flavored half and half..

(Hiamps...your RAC is higher than mine.
Stop that!)

That's what happens with the summer farm, I will probably almost catch up by fall and fall behind again, unless you get new A/C...Oh oh....
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 978335 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 14041
Credit: 208,696,464
RAC: 304
Australia
Message 978347 - Posted: 13 Mar 2010, 21:37:08 UTC - in response to Message 978335.  


The upload problem is certainly back with a vengeance. Almost all of the uploads time out within a second of starting. A few start to upload, but time out after 10-90 seconds.
With over a dozen queued for upload & over 20 minutes of clicking on Retry Now, only 2 uploaded.
Grant
Darwin NT
ID: 978347 · Report as offensive
nemesis
Joined: 12 Oct 99
Posts: 1408
Credit: 35,074,350
RAC: 0
Message 978351 - Posted: 13 Mar 2010, 21:44:18 UTC - in response to Message 978335.  

"That's what happens with the summer farm, I will probably almost catch up by fall and fall behind again, unless you get new A/C...Oh oh...."

me? new ac?
no
but the summer farm is bigger
and badder than before....

ID: 978351 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51592
Credit: 1,018,363,574
RAC: 1,004
United States
Message 978589 - Posted: 14 Mar 2010, 13:17:03 UTC - in response to Message 978351.  

"That's what happens with the summer farm, I will probably almost catch up by fall and fall behind again, unless you get new A/C...Oh oh...."

me? new ac?
no
but the summer farm is bigger
and badder than before....

Still pounding away here........
But nothing getting back to home.
Sad? Not really.
Wish I could win that lottery and give the Seti server closet the good cleaning it needs.....sigh.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 978589 · Report as offensive
Profile ccappel
Joined: 27 Jan 00
Posts: 362
Credit: 1,516,412
RAC: 0
United States
Message 978676 - Posted: 14 Mar 2010, 16:21:28 UTC

31 uploads waiting to go out and growing...
"Life is a tragedy for those who feel, and a comedy for those who think."

"I never get into an argument that I cannot win."
ID: 978676 · Report as offensive
Profile rebest Project Donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 45,357,093
RAC: 0
United States
Message 978682 - Posted: 14 Mar 2010, 16:39:45 UTC

The server status page says everything is just fine.

So... nothing is working! :)

Crunchin' the cache.




Join the PACK!
ID: 978682 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 978694 - Posted: 14 Mar 2010, 17:18:44 UTC

I'm getting downloads now so that isn't a problem for me, yet.

Looking at the Cricket graphs, it appears that uploads went wonky about 30 hours ago. Same thing with Scarecrow's results-received graphs. Uploads are currently at 10-20% of "normal" per hour.

The system may have started to degrade a day or so before it collapsed to current levels. Hopefully it won't be another extended case of "she canna take it Capt'n".
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 978694 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51592
Credit: 1,018,363,574
RAC: 1,004
United States
Message 978706 - Posted: 14 Mar 2010, 18:22:31 UTC
Last modified: 14 Mar 2010, 18:25:18 UTC

My 920 rig just ran out of Cuda work, and as there are 2300 or so VfingLAR's on the CPU, nothing left to reschedule here.

The Frozen One has a couple of days' worth yet, and the 3rd Cuda has one or two.

Hope this doesn't take days to fix like the last outage.

Another hack attack?
Or did some clown run another rogue script again?
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 978706 · Report as offensive
Profile 52 Aces
Joined: 7 Jan 02
Posts: 497
Credit: 14,261,068
RAC: 67
United States
Message 978731 - Posted: 14 Mar 2010, 20:16:45 UTC - in response to Message 978706.  

My 920 rig just ran out of Cuda work, and as there are 2300 or so VfingLAR's on the CPU, nothing left to reschedule here.

Thought you said you had 10 days worth for all the machines --- it's not really 10 days worth if they all don't have the right ratio ;-) Even on good days, the potential to have nothing but "V-to-the-f-Lars" exists, so I keep my ReScheduler slider over at 75% to compensate for the feast/famine nature of the tapes being split.

That the upload path is busted just amplifies things, but that could be weathered if there wasn't an artificial block on downloads. If this drags again, I'll take the half day hit to figure out how to set up a private boinc build absent that one line of code (greedy I know, but history seems to be repeating itself awfully quickly).
ID: 978731 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1649
Credit: 12,921,799
RAC: 89
New Zealand
Message 978733 - Posted: 14 Mar 2010, 20:18:11 UTC

Since about 9 UTC the number of work units waiting for assimilation has decreased by 111335.
This comes as a bit of a surprise to me, since there has been a low turnover of results returned. I have five tasks waiting to upload, and another 17 in my cache.
ID: 978733 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51592
Credit: 1,018,363,574
RAC: 1,004
United States
Message 978735 - Posted: 14 Mar 2010, 20:26:34 UTC - in response to Message 978731.  

My 920 rig just ran out of Cuda work, and as there are 2300 or so VfingLAR's on the CPU, nothing left to reschedule here.

Thought you said you had 10 days worth for all the machines --- it's not really 10 days worth if they all don't have the right ratio ;-) Even on good days, the potential to have nothing but "V-to-the-f-Lars" exists, so I keep my ReScheduler slider over at 75% to compensate for the feast/famine nature of the tapes being split.

That the upload path is busted just amplifies things, but that could be weathered if there wasn't an artificial block on downloads. If this drags again, I'll take the half day hit to figure out how to set up a private boinc build absent that one line of code (greedy I know, but history seems to be repeating itself awfully quickly).

10 days' worth includes the CPU....
The ratio of VLAR work being issued over the last few weeks has been rather high, so you can't reschedule what you can't get.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 978735 · Report as offensive
Profile 52 Aces
Joined: 7 Jan 02
Posts: 497
Credit: 14,261,068
RAC: 67
United States
Message 978747 - Posted: 14 Mar 2010, 21:09:49 UTC - in response to Message 978735.  

The ratio of VLAR work being issued over the last few weeks has been rather high, so you can't reschedule what you can't get.

I totally hear ya. It's for that reason that a few weeks back I would occasionally post a ratio of REG/VLAR/VHAR/AP. No one much noticed, but it was a valuable data point for me after I had gotten burned with no cuda work left, so I tried to share. I'd put my toe in the water to sample the ratio with about 200 WU's, and if it was favorable, I'd open the floodgates, and if not, I'd try again a day later assuming I saw new tapes go live (while keeping my fingers crossed that the project wouldn't pick that day to keel over completely).

Given that approach above, my ratio is still about 60/40 this afternoon (and I can always rebalance back to CPU if necessary), hence I'm not feeling the same pain (yes, I know you eat way more than I do, but I crunch enough work that I believe the same method would scale). The downside is I have to remember to run re-scheduler about every other day (to balance a 3 day CPU queue and an artificially loaded 12 day queue for the GPU). Yes, this only works for me because we seldom get 7 days of nothing but VLARs, but all these techniques are really little more than increasing personal breathing room for times when the project itself is off balance somehow, and pretty much all you can buy yourself is about half the distance of the shortest issued WU's (which is 2 weeks, thus 1 week of air).

Would be a ton easier and far more timely if the Server Status page just gave the % as part of the 'Ready to send' figure (and it could be argued if the cuda code were revisited, maybe there'd be no need for rescheduler in the first place).
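
The sample-then-commit approach described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the 0.6 threshold, and the idea of labelling each sampled WU with a plain string are all hypothetical, and nothing here talks to BOINC or the ReScheduler tool.

```python
from collections import Counter


def gpu_friendly_share(sample_types):
    """Return the fraction of sampled work units that are not VLAR.

    sample_types is a list of labels such as "REG", "VLAR", "VHAR", "AP"
    (hypothetical labels mirroring the REG/VLAR/VHAR/AP ratio in the post).
    """
    counts = Counter(sample_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts["VLAR"]) / total


def should_open_floodgates(sample_types, min_share=0.6):
    """Decide whether the sampled ratio looks favorable.

    min_share is an assumed threshold, not a value taken from the post.
    """
    return gpu_friendly_share(sample_types) >= min_share


def breathing_room_days(shortest_deadline_days):
    """Rule of thumb from the post: about half of the shortest deadline."""
    return shortest_deadline_days / 2


if __name__ == "__main__":
    # A made-up sample of 200 WUs, the sample size mentioned in the post.
    sample = ["REG"] * 120 + ["VLAR"] * 80
    print(gpu_friendly_share(sample))
    print(should_open_floodgates(sample))
    print(breathing_room_days(14))
```

With a two-week shortest deadline, the last function gives the "1 week of air" figure mentioned above.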
ID: 978747 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51592
Credit: 1,018,363,574
RAC: 1,004
United States
Message 978755 - Posted: 14 Mar 2010, 21:34:29 UTC - in response to Message 978747.  


Would be a ton easier and far more timely if the Server Status page just gave the % as part of the 'Ready to send' figure (and it could be argued if the cuda code were revisited, maybe there'd be no need for rescheduler in the first place).

Well, we all know that isn't going to happen, eh?

For now, I'd be happy if I could just report what work I do have done.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 978755 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 978794 - Posted: 14 Mar 2010, 23:25:11 UTC - in response to Message 978747.  

Would be a ton easier and far more timely if the Server Status page just gave the % as part of the 'Ready to send' figure (and it could be argued if the cuda code were revisited, maybe there'd be no need for rescheduler in the first place).

But there is only one queue RTS. At that point there is no knowledge of whether they will be crunched by a CPU or a GPU, they are the same WU's.

F.
ID: 978794 · Report as offensive
Profile ccappel
Joined: 27 Jan 00
Posts: 362
Credit: 1,516,412
RAC: 0
United States
Message 978958 - Posted: 15 Mar 2010, 13:08:08 UTC

Was there any definitive reason given for the last time the uploads were backlogged a couple weeks ago?
"Life is a tragedy for those who feel, and a comedy for those who think."

"I never get into an argument that I cannot win."
ID: 978958 · Report as offensive