Panic Mode On (83) Server Problems?


Message boards : Number crunching : Panic Mode On (83) Server Problems?

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 7081
Credit: 27,535,910
RAC: 36,091
United Kingdom
Message 1358456 - Posted: 19 Apr 2013, 8:10:30 UTC

All tasks now reported and new ones downloaded! No idea why, just lucky I guess.
____________


Today is life, the only life we're sure of. Make the most of today.

juan BFB
Project donor
Volunteer tester
Joined: 16 Mar 07
Posts: 5413
Credit: 306,468,853
RAC: 328,237
Brazil
Message 1358473 - Posted: 19 Apr 2013, 8:43:54 UTC - in response to Message 1358464.
Last modified: 19 Apr 2013, 8:45:04 UTC

The limits are there to prevent issues from arising in the first place. I fail to see how raising them would help the project at all.

The project scope is clearly stated, to use spare cpu cycles that would otherwise be "wasted" running screensavers or other idle tasks. It was never intended to cope with dedicated computers or specially built crunching farms with multiple GPU's and demands for massive caches to sustain them running 24x7.

By constantly demanding that the limits be raised you are making your need to execute as many SETI tasks as possible more important than the well-being of the project itself.

The fix to this problem is larger workunits which will reduce the number of "in progress" entries in the database.


I really agree with Mark, and please don't blame those who have the so-called "crunching farms" for the last year's SETI DB problems.

Another thing: I don't believe a small rise in the limits (from 100 WU per GPU host to 100 WU per GPU) will "crash" the SETI servers. In numbers, if we have 1000 hosts (I'm being very optimistic, surely it is a lot fewer) with 2 GPUs each, they will increase the total WU handled by 100k, nothing compared with the 3 million already waiting in the DB.
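For anyone who wants to check that arithmetic, here is a rough sketch. The host count, GPUs per host and the 3 million figure are the estimates from this post; the function name is made up for illustration only.

```python
def extra_tasks_in_progress(hosts, gpus_per_host, limit=100):
    """Extra tasks in progress when the limit moves from
    `limit` per GPU host to `limit` per GPU."""
    per_host_total = hosts * limit                  # current rule
    per_gpu_total = hosts * gpus_per_host * limit   # proposed rule
    return per_gpu_total - per_host_total


# Estimates taken from the post, not measured values.
extra = extra_tasks_in_progress(hosts=1000, gpus_per_host=2)
waiting_in_db = 3_000_000

print(extra)  # 100000
print(f"{extra / waiting_in_db:.1%} of the results already in the DB")  # 3.3%
```

With single-GPU hosts the two rules give the same total, so only multi-GPU hosts add anything.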

Nobody is asking for a 10-day cache, just a cache that could last a few hours (maybe 1/4 day) instead of minutes like the one we have now. Think about that.
____________

juan BFB
Project donor
Volunteer tester
Joined: 16 Mar 07
Posts: 5413
Credit: 306,468,853
RAC: 328,237
Brazil
Message 1358479 - Posted: 19 Apr 2013, 9:10:28 UTC

Mark.

We all know how much you love this project and all you have done for its success.

Go kitties!

____________

Andy Williams
Volunteer tester
Joined: 11 May 01
Posts: 187
Credit: 112,464,820
RAC: 0
United States
Message 1358535 - Posted: 19 Apr 2013, 11:37:28 UTC - in response to Message 1358477.

I can't afford to be cutting edge, my last power bill just arrived....
Over $550.00....for running all 9 rigs 24/7.


Wow, cheap. When I was running SETI flat out, many GPUs, >200,000 RAC my electric bill was over $1000 per month. That was years ago though with less efficient GPUs.
____________
--
Classic 82353 WU / 400979 h

Vipin Palazhi
Joined: 29 Feb 08
Posts: 249
Credit: 107,281,889
RAC: 74,857
India
Message 1358596 - Posted: 19 Apr 2013, 16:56:55 UTC

Am I glad that I am currently living in this part of the world. Electricity is included in the apartment rent, no matter how much I use. I pray it stays that way for a looooooong time.
______________

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4443
Credit: 119,137,444
RAC: 140,495
United States
Message 1358602 - Posted: 19 Apr 2013, 17:12:03 UTC - in response to Message 1358481.

Mark.

We all know how much you love this project and all you have done for its success.

Go kitties!

Well, it's all legend now...LOL.
Some do not appreciate it when I make some grandiose statements.
But when I do, I have been there, and done that.

I do not posture much. When I say I have been there and done that, I have done it.

Been there, done that, still saving up for the toaster... :D

____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,598,482
RAC: 47,541
Australia
Message 1358669 - Posted: 19 Apr 2013, 21:27:17 UTC - in response to Message 1358602.


Still struggling to get work, splitters still not cranking up to match demand.
____________
Grant
Darwin NT.

Lionel
Joined: 25 Mar 00
Posts: 576
Credit: 236,010,965
RAC: 230,198
Australia
Message 1358679 - Posted: 19 Apr 2013, 21:57:34 UTC - in response to Message 1358669.


It is just a pity that we do not have caches ... it might smooth out some of the unhappiness that's around.
____________

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 7081
Credit: 27,535,910
RAC: 36,091
United Kingdom
Message 1358682 - Posted: 19 Apr 2013, 22:01:01 UTC

All good here 5 machines with max tasks.
____________


Today is life, the only life we're sure of. Make the most of today.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,598,482
RAC: 47,541
Australia
Message 1358700 - Posted: 19 Apr 2013, 22:50:26 UTC - in response to Message 1358682.

All good here 5 machines with max tasks.

Down to 362 here of 400.
Every now & then i finally get some work which bumps it up, then dozens of "No tasks available" responses to work requests till the next bump up. But still unable to get the full 400 limit.
____________
Grant
Darwin NT.

Lionel
Joined: 25 Mar 00
Posts: 576
Credit: 236,010,965
RAC: 230,198
Australia
Message 1358765 - Posted: 20 Apr 2013, 3:10:41 UTC - in response to Message 1358700.

Results Ready to Send ... 0




____________

RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,654,737
RAC: 0
United States
Message 1358768 - Posted: 20 Apr 2013, 3:38:25 UTC - in response to Message 1358446.
Last modified: 20 Apr 2013, 3:38:58 UTC

The limits are there to prevent issues from arising in the first place. I fail to see how raising them would help the project at all.

The project scope is clearly stated, to use spare cpu cycles that would otherwise be "wasted" running screensavers or other idle tasks. It was never intended to cope with dedicated computers or specially built crunching farms with multiple GPU's and demands for massive caches to sustain them running 24x7.

By constantly demanding that the limits be raised you are making your need to execute as many SETI tasks as possible more important than the well-being of the project itself.

The fix to this problem is larger workunits which will reduce the number of "in progress" entries in the database.


i'm for 10 times larger MB tasks.

IF that "spare cpu cycle" theory was real then get rid of RAC and Credit, and don't keep score. see how many stick around, and then that will be a valid argument.
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,598,482
RAC: 47,541
Australia
Message 1358769 - Posted: 20 Apr 2013, 3:41:28 UTC - in response to Message 1358765.

Results Ready to Send ... 0

It's been that way for almost 2 days, the splitters can't keep up with demand so the Ready-to-send buffer has shrunk down to 0.
Hence the multitude of "No tasks sent" messages, with the odd request resulting in work.
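A toy model of why the buffer drains to 0 and then only the odd request gets work. All the rates and the starting buffer size below are made-up numbers for illustration, not figures from the server status page.

```python
def simulate_buffer(split_rate, request_rate, start=1000, steps=10):
    """Track a ready-to-send buffer; rates are tasks per time step."""
    ready = start
    history = []
    for _ in range(steps):
        ready += split_rate              # splitters add new tasks
        sent = min(ready, request_rate)  # scheduler can only send what exists
        ready -= sent
        unmet = request_rate - sent      # demand answered with "no tasks"
        history.append((ready, unmet))
    return history


# Hypothetical rates: demand is higher than what the splitters produce.
for ready, unmet in simulate_buffer(split_rate=300, request_rate=500):
    print(f"ready to send: {ready:4d}   unmet requests: {unmet}")
```

Once the buffer hits 0, every step hands out only what the splitters just produced and the rest of the demand goes unmet, which matches the pattern of occasional work between runs of "No tasks sent".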

There also appears to have been a problem with downloads: traffic dropped off to bugger all for a while there, and i ended up with all of my downloads in backoff mode. Hit retry a couple of times & they finally came through.
____________
Grant
Darwin NT.

ExchangeMan
Volunteer tester
Joined: 9 Jan 00
Posts: 113
Credit: 143,348,532
RAC: 202,235
United States
Message 1358770 - Posted: 20 Apr 2013, 3:46:47 UTC - in response to Message 1358769.

Results Ready to Send ... 0

It's been that way for almost 2 days, the splitters can't keep up with demand so the Ready-to-send buffer has shrunk down to 0.
Hence the multitude of "No tasks sent" messages, with the odd request resulting in work.

There also appears to have been a problem with downloads: traffic dropped off to bugger all for a while there, and i ended up with all of my downloads in backoff mode. Hit retry a couple of times & they finally came through.

I've seen this odd download behavior also.

____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,598,482
RAC: 47,541
Australia
Message 1358771 - Posted: 20 Apr 2013, 3:50:48 UTC - in response to Message 1358768.

The fix to this problem is larger workunits which will reduce the number of "in progress" entries in the database.


i'm for 10 times larger MB tasks.

The problem is that's a kludge, not a fix.
Several times over the years they've upped the sensitivity of the application, resulting in greatly increased processing times. But the fact is CPUs & now GPUs continue to improve their performance at the same incredible rate they've been doing so for years.

The other option is more capable hardware on the server side, but once again that's not fixing the problem, just working around it.


The problem is the database & the only real fix will probably be to re-design it from scratch.

One database for the in-progress work (WUs waiting to be sent, WUs returned and waiting for validation, etc.) & another for the work that has been completed & validated. The second database will continue to grow over time; the first will only ever be as large as the amount of work that is in progress. That will allow people to have large caches, and the Scheduler & the database won't be struggling with the huge number of entries they need to manage at present.
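A minimal sketch of that split, using two in-memory SQLite databases. The table layout, column names and function names are all hypothetical and are not the project's actual schema; it only shows the active side staying small while the archive grows.

```python
import sqlite3

# Hypothetical layout: one DB for in-progress work, one for validated work.
active = sqlite3.connect(":memory:")
archive = sqlite3.connect(":memory:")

active.execute(
    "CREATE TABLE workunit (id INTEGER PRIMARY KEY, state TEXT NOT NULL)")
archive.execute(
    "CREATE TABLE workunit (id INTEGER PRIMARY KEY, validated_at TEXT NOT NULL)")


def add_workunit(wu_id):
    """New work starts life in the active DB."""
    active.execute(
        "INSERT INTO workunit (id, state) VALUES (?, 'ready_to_send')", (wu_id,))


def archive_validated(wu_id, validated_at):
    """Move one validated workunit from the active DB to the archive."""
    archive.execute(
        "INSERT INTO workunit (id, validated_at) VALUES (?, ?)",
        (wu_id, validated_at))
    active.execute("DELETE FROM workunit WHERE id = ?", (wu_id,))


def count(db):
    return db.execute("SELECT COUNT(*) FROM workunit").fetchone()[0]


for wu_id in range(1, 6):
    add_workunit(wu_id)
archive_validated(1, "2013-04-20")
archive_validated(2, "2013-04-20")

print(count(active), count(archive))  # 3 2
```

A real version would need the move to be atomic across both databases, which this sketch does not attempt.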
____________
Grant
Darwin NT.

Copyright © 2014 University of California