Panic Mode On (83) Server Problems?


Profile Bernie Vine
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 26 May 99
Posts: 6906
Credit: 25,758,109
RAC: 38,915
United Kingdom
Message 1358456 - Posted: 19 Apr 2013, 8:10:30 UTC

All tasks now reported and new ones downloaded! No idea why, just lucky I guess.
____________


Today is life, the only life we're sure of. Make the most of today.

msattler
Project donor
Volunteer tester
Avatar
Send message
Joined: 9 Jul 00
Posts: 38922
Credit: 578,689,355
RAC: 515,729
United States
Message 1358459 - Posted: 19 Apr 2013, 8:13:28 UTC - in response to Message 1358456.

All tasks now reported and new ones downloaded! No idea why, just lucky I guess.

Yeah...dicey right now. Servers are doing all they can with the dilithium crystals available.....
I have my best cruncher sucking GPU fumes right now, but the kitties shall keep sniffing for more WUs.

I got 9 rigs asking, so one or more of them is bound to hit Seti paydirt.

LOL...
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

msattler
Project donor
Volunteer tester
Avatar
Send message
Joined: 9 Jul 00
Posts: 38922
Credit: 578,689,355
RAC: 515,729
United States
Message 1358464 - Posted: 19 Apr 2013, 8:28:56 UTC - in response to Message 1358454.
Last modified: 19 Apr 2013, 8:29:38 UTC

The limits are there to prevent issues from arising in the first place. I fail to see how raising them would help the project at all.

The project scope is clearly stated: to use spare CPU cycles that would otherwise be "wasted" running screensavers or other idle tasks. It was never intended to cope with dedicated computers or specially built crunching farms with multiple GPUs and demands for massive caches to sustain them running 24x7.

By constantly demanding that the limits be raised you are making your need to execute as many SETI tasks as possible more important than the well-being of the project itself.

The fix to this problem is larger workunits, which will reduce the number of "in progress" entries in the database.


Alan...
Please don't try to remind some of us what 'the original scope of the project' was.
We know that. And we all (some of us, anyway) know that the capacity of the project has thankfully grown far beyond the original concept.
No, at its inception it was never envisioned that vast numbers of like-minded individuals would flock to the calling. But we have.
And the project has tried, to the best of its ability, to respond.
They have done rather well, considering the meager funding...provided in most later years by its own participants.

Matt has indicated that, since the move to the colocation facility, our massive ability to process data may well be outstripping the project's capacity to gather it.

I see this as a great thing for the SETI project. And proof that the whole BOINC distributed computing concept has indeed come to fruition. We, as small users, have amassed such an enormous amount of computing power that the project might have some trouble supplying enough data for us to process......who would have dreamed?
Not Eric or Matt, I assure you.

As I have suggested (though I am not in the direct loop right now), given the relocation of the servers and the project's improved ability to supply data to the crunching base, I suspect one upshot is that the upcoming GPUUG fundraisers might well be for a dedicated server to process the accumulated results.

The kitties would still like to get and process as much work as possible to further the initial goal of this project, which is, of course, to prove that we are not alone.

I believe that every day we get closer to proving that reality.


Do you understand what I am saying, Alan?
I am not trying to put you down.
Just a little history from one who has been here a loooooong time.
And I know you have been, as well.

It's a success story, not a problem, as I see it.
Or perhaps a little of both.

Meow.
Mark.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

juan BFB
Project donor
Volunteer tester
Avatar
Send message
Joined: 16 Mar 07
Posts: 5229
Credit: 285,023,813
RAC: 453,360
Brazil
Message 1358473 - Posted: 19 Apr 2013, 8:43:54 UTC - in response to Message 1358464.
Last modified: 19 Apr 2013, 8:45:04 UTC

The limits are there to prevent issues from arising in the first place. I fail to see how raising them would help the project at all.

The project scope is clearly stated: to use spare CPU cycles that would otherwise be "wasted" running screensavers or other idle tasks. It was never intended to cope with dedicated computers or specially built crunching farms with multiple GPUs and demands for massive caches to sustain them running 24x7.

By constantly demanding that the limits be raised you are making your need to execute as many SETI tasks as possible more important than the well-being of the project itself.

The fix to this problem is larger workunits, which will reduce the number of "in progress" entries in the database.


I really agree with Mark, and please don't blame those who have the so-called "crunching farms" for the last year's SETI DB problems.

Another thing: I don't believe a small rise in the limits (from 100 WUs per GPU host to 100 WUs per GPU) will "crash" the SETI servers. In numbers: if we have 1,000 hosts with 2 GPUs each (I'm being very optimistic; it's surely a lot less), they will increase the total WUs handled by 100k, nothing compared with the 3 million already waiting in the DB.

Nobody is asking for a 10-day cache, just a cache that could hold for a few hours (maybe a quarter of a day) instead of minutes like the one we have now. Think about that.
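
In code, the back-of-the-envelope arithmetic above (the 1,000-host and 2-GPU figures are optimistic assumptions, not measured project statistics):

# All inputs here are assumptions taken from the post above.
hosts = 1000                  # assumed number of dual-GPU hosts (optimistic)
gpus_per_host = 2
per_host_limit = 100          # current cap: 100 WUs per GPU host
per_gpu_limit = 100           # proposed cap: 100 WUs per GPU

current  = hosts * per_host_limit                 # 100,000 in-progress entries
proposed = hosts * gpus_per_host * per_gpu_limit  # 200,000 in-progress entries
extra    = proposed - current                     # 100,000 extra entries

backlog = 3_000_000           # WUs already waiting in the DB
print(extra, extra / backlog) # 100000 0.033... -> roughly 3% of the backlog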
____________

msattler
Project donor
Volunteer tester
Avatar
Send message
Joined: 9 Jul 00
Posts: 38922
Credit: 578,689,355
RAC: 515,729
United States
Message 1358477 - Posted: 19 Apr 2013, 9:00:05 UTC - in response to Message 1358473.
Last modified: 19 Apr 2013, 9:09:00 UTC

The limits are there to prevent issues from arising in the first place. I fail to see how raising them would help the project at all.

The project scope is clearly stated: to use spare CPU cycles that would otherwise be "wasted" running screensavers or other idle tasks. It was never intended to cope with dedicated computers or specially built crunching farms with multiple GPUs and demands for massive caches to sustain them running 24x7.

By constantly demanding that the limits be raised you are making your need to execute as many SETI tasks as possible more important than the well-being of the project itself.

The fix to this problem is larger workunits, which will reduce the number of "in progress" entries in the database.


I really agree with Mark, and please don't blame those who have the so-called "crunching farms" for the last year's SETI DB problems.

Another thing: I don't believe a small rise in the limits (from 100 WUs per GPU host to 100 WUs per GPU) will "crash" the SETI servers. In numbers: if we have 1,000 hosts with 2 GPUs each (I'm being very optimistic; it's surely a lot less), they will increase the total WUs handled by 100k, nothing compared with the 3 million already waiting in the DB.

Nobody is asking for a 10-day cache, just a cache that could hold for a few hours (maybe a quarter of a day) instead of minutes like the one we have now. Think about that.

I don't have all the answers......
But, let me tell you this much.

I am holding my own with mostly 2-3 year old hardware. I pretty much know how to make it run. If I had the funds to go balls out......I could take on the best.
And.....please let this be known, because I am proud of it.....
I have done this 'at home' with my own funds, on my own power bill.
Not by running on a business's power or a bunch of school computers.

My own. Here in my home. Nothing but.

I have survived thanks to a number of very kind souls who, going back a few years, have donated to me more than a few motherboards, a few PSUs, and many more than a few GPUs. I have just received a donation of more than a few 580s......
Which, as I have quipped before, might be trash to some, but gold to the kitties.

I can't afford to be cutting edge, my last power bill just arrived....
Over $550.00....for running all 9 rigs 24/7.

I have another donated mobo that I have not yet fully set up to run (I have apologized to the donor).......because I cannot afford to pay more for the juice to run another rig.

Y'all have been so kind.

Just know the kittyman is doing all he can for the project, and will continue to do so as long as he can.


Want to respond to me?
My PM box is open to all....................
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

msattler
Project donor
Volunteer tester
Avatar
Send message
Joined: 9 Jul 00
Posts: 38922
Credit: 578,689,355
RAC: 515,729
United States
Message 1358478 - Posted: 19 Apr 2013, 9:08:02 UTC

And the kitties have regained most of their caches.

Meow!!!
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

juan BFB
Project donor
Volunteer tester
Avatar
Send message
Joined: 16 Mar 07
Posts: 5229
Credit: 285,023,813
RAC: 453,360
Brazil
Message 1358479 - Posted: 19 Apr 2013, 9:10:28 UTC

Mark.

We all know how much you love this project and all you have done for its success.

Go kitties!

____________

msattler
Project donor
Volunteer tester
Avatar
Send message
Joined: 9 Jul 00
Posts: 38922
Credit: 578,689,355
RAC: 515,729
United States
Message 1358481 - Posted: 19 Apr 2013, 9:13:40 UTC - in response to Message 1358479.

Mark.

We all know how much you love this project and all you have done for its success.

Go kitties!

Well, it's all legend now...LOL.
Some do not appreciate it when I make some grandiose statements.
But when I do, I have been there, and done that.

I do not posture much. When I say I have been there and done that, I have done it.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Andy Williams
Volunteer tester
Avatar
Send message
Joined: 11 May 01
Posts: 187
Credit: 112,464,820
RAC: 0
United States
Message 1358535 - Posted: 19 Apr 2013, 11:37:28 UTC - in response to Message 1358477.

I can't afford to be cutting edge, my last power bill just arrived....
Over $550.00....for running all 9 rigs 24/7.


Wow, cheap. When I was running SETI flat out, with many GPUs and >200,000 RAC, my electric bill was over $1,000 per month. That was years ago, though, with less efficient GPUs.
____________
--
Classic 82353 WU / 400979 h

Profile Vipin Palazhi
Avatar
Send message
Joined: 29 Feb 08
Posts: 247
Credit: 103,859,693
RAC: 29,123
India
Message 1358596 - Posted: 19 Apr 2013, 16:56:55 UTC

Am I glad that I am currently living in this part of the world! Electricity is included in the apartment rent, no matter how much I use. I pray it stays that way for a looooooong time.
______________

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 4080
Credit: 111,692,840
RAC: 146,844
United States
Message 1358602 - Posted: 19 Apr 2013, 17:12:03 UTC - in response to Message 1358481.

Mark.

We all know how much you love this project and all you have done for its success.

Go kitties!

Well, it's all legend now...LOL.
Some do not appreciate it when I make some grandiose statements.
But when I do, I have been there, and done that.

I do not posture much. When I say I have been there and done that, I have done it.

Been there, done that, still saving up for the toaster... :D

____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Grant (SSSF)
Send message
Joined: 19 Aug 99
Posts: 5791
Credit: 58,028,037
RAC: 48,169
Australia
Message 1358669 - Posted: 19 Apr 2013, 21:27:17 UTC - in response to Message 1358602.


Still struggling to get work; the splitters are still not cranking up to match demand.
____________
Grant
Darwin NT.

Lionel
Send message
Joined: 25 Mar 00
Posts: 544
Credit: 224,298,225
RAC: 216,163
Australia
Message 1358679 - Posted: 19 Apr 2013, 21:57:34 UTC - in response to Message 1358669.


It is just a pity that we do not have caches ... it might smooth out some of the unhappiness that's around.
____________

Profile Bernie Vine
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 26 May 99
Posts: 6906
Credit: 25,758,109
RAC: 38,915
United Kingdom
Message 1358682 - Posted: 19 Apr 2013, 22:01:01 UTC

All good here, 5 machines with max tasks.
____________


Today is life, the only life we're sure of. Make the most of today.

Grant (SSSF)
Send message
Joined: 19 Aug 99
Posts: 5791
Credit: 58,028,037
RAC: 48,169
Australia
Message 1358700 - Posted: 19 Apr 2013, 22:50:26 UTC - in response to Message 1358682.

All good here, 5 machines with max tasks.

Down to 362 of 400 here.
Every now & then I finally get some work, which bumps it up, then dozens of "No tasks available" responses to work requests till the next bump. But still unable to get the full 400 limit.
____________
Grant
Darwin NT.

Lionel
Send message
Joined: 25 Mar 00
Posts: 544
Credit: 224,298,225
RAC: 216,163
Australia
Message 1358765 - Posted: 20 Apr 2013, 3:10:41 UTC - in response to Message 1358700.

Results Ready to Send ... 0

____________

Profile RottenMutt
Avatar
Send message
Joined: 15 Mar 01
Posts: 992
Credit: 207,654,737
RAC: 1
United States
Message 1358768 - Posted: 20 Apr 2013, 3:38:25 UTC - in response to Message 1358446.
Last modified: 20 Apr 2013, 3:38:58 UTC

The limits are there to prevent issues from arising in the first place. I fail to see how raising them would help the project at all.

The project scope is clearly stated: to use spare CPU cycles that would otherwise be "wasted" running screensavers or other idle tasks. It was never intended to cope with dedicated computers or specially built crunching farms with multiple GPUs and demands for massive caches to sustain them running 24x7.

By constantly demanding that the limits be raised you are making your need to execute as many SETI tasks as possible more important than the well-being of the project itself.

The fix to this problem is larger workunits, which will reduce the number of "in progress" entries in the database.


I'm for 10-times-larger MB tasks.

IF that "spare CPU cycles" theory were real, then get rid of RAC and credit and don't keep score. See how many stick around; then that will be a valid argument.
____________

Grant (SSSF)
Send message
Joined: 19 Aug 99
Posts: 5791
Credit: 58,028,037
RAC: 48,169
Australia
Message 1358769 - Posted: 20 Apr 2013, 3:41:28 UTC - in response to Message 1358765.

Results Ready to Send ... 0

It's been that way for almost 2 days; the splitters can't keep up with demand, so the ready-to-send buffer has shrunk to 0.
Hence the multitude of "No tasks sent" messages, with only the odd request resulting in work.

There also appears to have been a problem with downloads: traffic dropped off to bugger all for a while there, and I ended up with all of my downloads in backoff mode. Hit retry a couple of times & they finally came through.
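
For anyone wondering what "backoff mode" means: the client waits longer & longer between retries of a failed download. A minimal sketch of the idea (the constants are illustrative only, not BOINC's actual policy):

import random

def next_backoff(retry_count, base=60.0, cap=4 * 3600.0):
    # The delay doubles with each failed attempt, up to a cap, with
    # random jitter so thousands of clients don't all retry at once.
    # (Constants are invented for illustration; BOINC's real values differ.)
    delay = min(cap, base * (2 ** retry_count))
    return delay * random.uniform(0.5, 1.0)

# Delays (in seconds) after successive failed download attempts:
for attempt in range(6):
    print(attempt, round(next_backoff(attempt)))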
____________
Grant
Darwin NT.

ExchangeMan
Volunteer tester
Send message
Joined: 9 Jan 00
Posts: 108
Credit: 131,970,457
RAC: 212,235
United States
Message 1358770 - Posted: 20 Apr 2013, 3:46:47 UTC - in response to Message 1358769.

Results Ready to Send ... 0

It's been that way for almost 2 days; the splitters can't keep up with demand, so the ready-to-send buffer has shrunk to 0.
Hence the multitude of "No tasks sent" messages, with only the odd request resulting in work.

There also appears to have been a problem with downloads: traffic dropped off to bugger all for a while there, and I ended up with all of my downloads in backoff mode. Hit retry a couple of times & they finally came through.

I've seen this odd download behavior also.

____________

Grant (SSSF)
Send message
Joined: 19 Aug 99
Posts: 5791
Credit: 58,028,037
RAC: 48,169
Australia
Message 1358771 - Posted: 20 Apr 2013, 3:50:48 UTC - in response to Message 1358768.

The fix to this problem is larger workunits, which will reduce the number of "in progress" entries in the database.


I'm for 10-times-larger MB tasks.

The problem is, that's a kludge, not a fix.
Several times over the years they've upped the sensitivity of the application, resulting in greatly increased processing times. But the fact is CPUs, and now GPUs, continue to improve their performance at the same incredible rate they have for years.

The other option is more capable hardware on the server side, but once again that's not fixing the problem, just working around it.


The problem is the database, & the only real fix will probably be to re-design it from scratch.

One database for the in-progress work (WUs waiting to be sent, WUs returned and awaiting validation, etc.) & another for the work that has been completed & validated. The second database would continue to grow over time, but the first would only ever be as large as the amount of work in progress. That would allow people to have large caches, and the scheduler & the database wouldn't be struggling with the huge number of entries they have to manage at present.
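
A toy illustration of that two-database split, sketched with SQLite (the table and column names here are invented; the real SETI@home schema is vastly larger):

import sqlite3

# Hot database: only work currently in flight. Stays small, because
# rows leave it as soon as the work is finished.
live = sqlite3.connect("inprogress.db")
live.execute("""CREATE TABLE IF NOT EXISTS workunit (
    id    INTEGER PRIMARY KEY,
    state TEXT CHECK (state IN ('unsent', 'in_progress', 'awaiting_validation'))
)""")

# Cold database: completed & validated work. Grows forever, but the
# scheduler never has to touch it.
archive = sqlite3.connect("archive.db")
archive.execute("""CREATE TABLE IF NOT EXISTS result (
    workunit_id INTEGER PRIMARY KEY,
    credit      REAL
)""")

def on_validated(wu_id, credit):
    # When a WU validates, move it from the hot DB to the archive, so
    # the hot DB's size tracks only the work actually in progress.
    archive.execute("INSERT INTO result VALUES (?, ?)", (wu_id, credit))
    live.execute("DELETE FROM workunit WHERE id = ?", (wu_id,))
    archive.commit()
    live.commit()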
____________
Grant
Darwin NT.
