Panic Mode On (105) Server Problems?

Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1857204 - Posted: 23 Mar 2017, 12:44:28 UTC - in response to Message 1857185.  

The project has NEVER promised to keep us fed with work, they have ALWAYS told us to have standby projects to fall back onto in the event of (extended) outages.

But it would be nice if it were possible to carry at least 24 hours of work, even 48. The option is there in the settings- it would be appreciated if it were possible.
Why not raise the server side limits, but restrict all users to a maximum of 2 days cache, leaving more work available to faster crunchers to see them through these outages, but still limiting the load on the database?

If there is no data available to crunch, then there's no work.
But if there is work available, but the project has issues, it would be nice to be able to continue processing it.

It has been said that there will (eventually) be way more work than the present user base can process in a reasonable time, and more crunchers are needed. But it doesn't matter how many crunchers there are, if they can't process the work. And even if the servers spend most of a week down, if in their uptime people are able to get enough work to tide them through to the next uptime, people will continue to crunch.

Being able to have a 24 hour cache of SETI@home work would be nice.
I don't think a project could set a maximum cache in days, given that it is a BOINC-wide setting rather than a project setting. I don't doubt they would implement such a setting, but with the lack of BOINC development I don't expect anything like that to happen soon.
Doubling the current 100 CPU / 100 GPU*N limit would probably not be a great idea, given that we have been in the 5-6 million range of results in progress fairly recently. Previously the db server was falling over when we were hitting ~10-11 million. I believe it was stated at the time that the server wasn't hitting a hardware limitation, but more of a software or database limitation.
At one time it was mentioned they were considering switching from the current Informix software, but that was some time ago. With Nebula I'm not sure how priorities may be heading now.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1857204 · Report as offensive
Mark Stevenson Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 8 Sep 11
Posts: 1736
Credit: 174,899,165
RAC: 91
United Kingdom
Message 1857209 - Posted: 23 Mar 2017, 13:26:55 UTC - in response to Message 1857204.  
Last modified: 23 Mar 2017, 13:28:03 UTC

Doubling the current 100 CPU / 100 GPU*N limit would probably not be a great idea, given that we have been in the 5-6 million range of results in progress fairly recently. Previously the db server was falling over when we were hitting ~10-11 million. I believe it was stated at the time that the server wasn't hitting a hardware limitation, but more of a software or database limitation.


ALL of my machines ran out of work this week, so what? Even with raised "limits" people would STILL moan on about it, whatever they were. I ain't got backup projects coz I DON'T want any; if I did I'd sign up to other projects.
Finished proper work, saw the computers had no WUs, so used the time to get the airline to them and blow out the "dust bunnies" properly.

If running out of work is that much of a problem SEE YOUR FAMILY DR / GP and GET SOME HELP ( medication !! ) COZ YOU NEED IT !!!
the world DON'T stop coz seti@home aint sendin out wu's for a few hours
Life is what you make of it :-)

When i'm good i'm very good , but when i'm bad i'm shi#eloads better ;-) In't I " buttercups " p.m.s.l at authoritie !!;-)
ID: 1857209 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1857224 - Posted: 23 Mar 2017, 15:22:13 UTC
Last modified: 23 Mar 2017, 15:22:26 UTC

Max WUs should be tied to RAC. 300 000 -> 3000 WUs.
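
The single data point in this suggestion (a RAC of 300,000 mapping to 3,000 WUs) implies roughly one workunit per 100 RAC. A minimal Python sketch of that idea follows. It assumes the scaling is linear and assumes a floor equal to the current 100-task limit so that new or slow hosts still get work; both are illustrative assumptions, and `max_wus_for_rac` is a hypothetical name, not anything in the BOINC server code.

```python
def max_wus_for_rac(rac: float, rac_per_wu: float = 100.0, floor: int = 100) -> int:
    """Hypothetical per-host task limit that scales linearly with RAC.

    rac_per_wu=100 comes from the post's example (300,000 RAC -> 3,000 WUs).
    floor=100 is an assumption so that low-RAC hosts are not starved.
    """
    if rac < 0:
        raise ValueError("RAC cannot be negative")
    return max(floor, int(rac // rac_per_wu))


print(max_wus_for_rac(300_000))  # the post's example host
print(max_wus_for_rac(57))       # a low-RAC host falls back to the floor
```

With a rule like this the limit follows demonstrated output rather than being the same for every host, which is the point of the suggestion.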
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1857224 · Report as offensive
BetelgeuseFive Project Donor
Volunteer tester

Joined: 6 Jul 99
Posts: 158
Credit: 17,117,787
RAC: 19
Netherlands
Message 1857230 - Posted: 23 Mar 2017, 16:30:56 UTC - in response to Message 1857224.  

Max WUs should be tied to RAC. 300 000 -> 3000 WUs.


I agree that some relation between RAC and maximum number of workunits would be an improvement. However, to prevent the number of tasks that are out in the field from reaching all-time highs, I think something else is needed. IMHO the high number of results out in the field is not caused by the top crunchers, who usually return their results within 24 hours, but by the results that are returned (too) late. I would suggest dramatically reducing the task deadlines (for some tasks currently over 2 months) to something like 2 or 3 weeks max. Even on relatively slow hardware it should be possible to return results in that timeframe.

Just my 2 cents ...

Tom
ID: 1857230 · Report as offensive
Profile IntenseGuy

Joined: 25 Sep 00
Posts: 190
Credit: 23,498,825
RAC: 9
United States
Message 1857233 - Posted: 23 Mar 2017, 16:39:21 UTC - in response to Message 1857230.  

I think the "real" solution would be limiting the downtime. The Einstein project never seems to require downtime for backups - why does SETI?
ID: 1857233 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1857234 - Posted: 23 Mar 2017, 16:39:29 UTC

Max WU = the lesser of 500 or 500/TurnAroundTime

If you don't send them back for 10 days, you get 50, 500 if you're fast.
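
The formula in this post can be sketched in a few lines of Python. The function name `max_wu`, the use of days as the turnaround unit, and the minimum of one task are assumptions added for illustration; the cap of 500 and the 10-day example come from the post.

```python
def max_wu(turnaround_days: float, cap: int = 500) -> int:
    """Task limit = the lesser of cap or cap / average turnaround time (days).

    The result is rounded down, and never drops below 1 so that even a
    very slow host can still fetch a task (that floor is an assumption).
    """
    if turnaround_days <= 0:
        raise ValueError("turnaround time must be positive")
    return max(1, min(cap, int(cap / turnaround_days)))


print(max_wu(10))    # slow returner: 50 tasks
print(max_wu(1))     # one-day turnaround: the full 500
print(max_wu(0.25))  # faster than a day is still capped at 500
```

Because the limit shrinks in proportion to how long a host holds its work, the number of results a host can park in the database is bounded by how quickly it actually returns them.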
ID: 1857234 · Report as offensive
Mark Stevenson Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 8 Sep 11
Posts: 1736
Credit: 174,899,165
RAC: 91
United Kingdom
Message 1857237 - Posted: 23 Mar 2017, 17:19:35 UTC - in response to Message 1857224.  

Max WUs should be tied to RAC. 300 000 -> 3000 WUs.


What's in place at the moment might not be "perfect", but I would think the actual people in the lab had a discussion about what the limits should be before they spent time coding it into the system and implementing it.
It's been this way for a few years now, but at least it sorta works and the same rules should apply to everyone.

There are ways to "scheme the system" but 99% of people will call them out as a -wipes (and have done; just do a bit of reading).
If RAC is SO IMPORTANT to you, there are other projects that "pay" loads better, so why don't you crunch for them? I crunch SETI coz I believe in its objective, and it's probably the same for > 97% of people.
Life is what you make of it :-)

When i'm good i'm very good , but when i'm bad i'm shi#eloads better ;-) In't I " buttercups " p.m.s.l at authoritie !!;-)
ID: 1857237 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1857238 - Posted: 23 Mar 2017, 17:27:01 UTC - in response to Message 1857234.  

Max WU = the lesser of 500 or 500/TurnAroundTime

If you don't send them back for 10 days, you get 50, 500 if you're fast.


. . Tying allocation numbers to return time does make sense. It is not just the number of WUs sent to the field that enlarges the database; a big factor is the time they remain in the field. Each WU is replicated twice for initial issue to the field, with one copy remaining on the server. So take a big hitter like Petri's machine crunching, let's say, 2000 per day, with most returned and, presuming the wingmen are as efficient, cleared in a day: that takes up a total of 6000 slot days in the database. Seems a lot. Now compare that to the effect of one dud host with a cache of, say, 200 WUs sitting in limbo for 6 to 8 weeks and not being processed. By the time they time out they will have taken up 42 x 3 slot days (6 week deadline) per WU, or 42 x 3 x 200 = 25,200 slot days in total. In Petri's example the machine will occupy approx. 6000 slot days in the database consistently over time, but that represents productive use of database space.
The dud host will only occupy 600 slots per day, but the slot days per WU are multiplied by the number of days they sit there, and all of it is non-productive. And when they do time out and go to a third host it becomes 4 slots per WU per day until it is cleared. And most of us know this can happen two or three times before some WUs are finally cleared. Measures to restrict this impact could have a quite significant effect on database size, and most importantly reduce database space that is simply non-productive overhead. Shorter deadlines can go part way to eliminating at least some of that wasted space.
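
The slot-day arithmetic in this post can be checked with a short Python sketch. The figure of 3 database slots per WU (two copies in the field plus one on the server) and the host numbers are taken from the post; the function name `slot_days` and the 21-day comparison are illustrative additions.

```python
SLOTS_PER_WU = 3  # two replicas in the field plus the server copy, per the post


def slot_days(workunits: int, days_held: int, slots_per_wu: int = SLOTS_PER_WU) -> int:
    """Database slot days consumed by a batch of WUs held for days_held days."""
    return workunits * days_held * slots_per_wu


fast_host = slot_days(2000, 1)         # 2000 WUs cleared within a day
dud_host = slot_days(200, 42)          # 200 WUs idle until a 6 week deadline
dud_host_3_weeks = slot_days(200, 21)  # same host under a 3 week deadline

print(fast_host)         # 6000, all of it productive
print(dud_host)          # 25200, none of it productive
print(dud_host_3_weeks)  # 12600, half the wasted space
```

The fast host's 6000 slot days recur daily but each produces validated results, while the dud host's 25,200 slot days produce nothing, and halving the deadline halves that waste.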

Stephen
ID: 1857238 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1857242 - Posted: 23 Mar 2017, 17:49:25 UTC - in response to Message 1857238.  

A succinct analysis, Stephen. I agree that a reduced deadline would almost immediately have a huge impact on database size and performance. But do we really think we can change the mindset of the project scientists that SETI needs to be anything other than a screensaver app from the late 90's that uses your spare CPU cycles when you aren't using your computer? The project is still tuned to the lowest common denominator of PC hardware.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1857242 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1857244 - Posted: 23 Mar 2017, 18:16:14 UTC - in response to Message 1857230.  

I agree that some relation between RAC and maximum number of workunits would be an improvement. However, to prevent the number of tasks that are out in the field from reaching all-time highs, I think something else is needed. IMHO the high number of results out in the field is not caused by the top crunchers, who usually return their results within 24 hours, but by the results that are returned (too) late. I would suggest dramatically reducing the task deadlines (for some tasks currently over 2 months) to something like 2 or 3 weeks max. Even on relatively slow hardware it should be possible to return results in that timeframe.

No need to reduce deadlines (and make a lot of older hardware, and new ARM hardware, too slow to crunch for the project).
Just make the top limit 2 days of work. Instead of slower hardware carrying 10+ days of work while faster systems run out in a matter of hours, slow systems still get to crunch, faster systems spend less time idle, and the database remains (mostly) manageable.
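
As a rough illustration of a cache limit expressed in days rather than in a fixed task count, here is a minimal Python sketch. It assumes the server has some estimate of how many tasks a host completes per day; the function name `tasks_for_cache` and that input are hypothetical, and this is not how the current limits are implemented.

```python
import math


def tasks_for_cache(tasks_per_day: float, cache_days: float = 2.0) -> int:
    """Tasks a host may hold if its cache is capped at cache_days of work.

    tasks_per_day is an assumed estimate of the host's daily throughput.
    Every host is allowed at least one task so slow machines can still crunch.
    """
    if tasks_per_day <= 0:
        raise ValueError("throughput must be positive")
    return max(1, math.ceil(tasks_per_day * cache_days))


print(tasks_for_cache(2000))  # a very fast host holds 4000 tasks
print(tasks_for_cache(0.25))  # a slow host holds a single task
```

Under such a rule the fast host gets enough work to ride out a two-day outage, while a slow host no longer holds weeks of tasks it cannot finish soon.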
Grant
Darwin NT
ID: 1857244 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22535
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1857247 - Posted: 23 Mar 2017, 18:31:49 UTC

As one with a top twenty cruncher, which most of the contributors to this thread do not have, I can honestly say - "You greedy bunch of twonks". I live with the limits, the fact is that most weekly outages at least two of my crunchers run out of work. The key thing is what I said earlier, and this is repeating the long stated policy of SETI@Home, that there is NEVER going to be a guarantee of work from the project, be it for a short period of time, or over a longer period. So I would suggest if you don't like that go to another project and donate your time to that one. Otherwise stop moaning.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1857247 · Report as offensive
Mark Stevenson Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 8 Sep 11
Posts: 1736
Credit: 174,899,165
RAC: 91
United Kingdom
Message 1857253 - Posted: 23 Mar 2017, 19:47:04 UTC - in response to Message 1857252.  

As one with a top twenty cruncher, which most of the contributors to this thread do not have, I can honestly say - "You greedy bunch of twonks". I live with the limits, the fact is that most weekly outages at least two of my crunchers run out of work. The key thing is what I said earlier, and this is repeating the long stated policy of SETI@Home, that there is NEVER going to be a guarantee of work from the project, be it for a short period of time, or over a longer period. So I would suggest if you don't like that go to another project and donate your time to that one. Otherwise stop moaning.



Yup, extremely well said.


+1 to both posts, and I don't care what anybody thinks or posts in response to this post!!!
Life is what you make of it :-)

When i'm good i'm very good , but when i'm bad i'm shi#eloads better ;-) In't I " buttercups " p.m.s.l at authoritie !!;-)
ID: 1857253 · Report as offensive
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1857260 - Posted: 23 Mar 2017, 21:15:51 UTC - in response to Message 1857244.  

No need to reduce deadlines (and make a lot of older hardware, and new ARM hardware, too slow to crunch for the project).
Just make the top limit 2 days of work. Instead of slower hardware carrying 10+ days of work while faster systems run out in a matter of hours, slow systems still get to crunch, faster systems spend less time idle, and the database remains (mostly) manageable.

If that's possible then I think it sounds great.
ID: 1857260 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24912
Credit: 3,081,182
RAC: 7
Ireland
Message 1857273 - Posted: 23 Mar 2017, 22:31:19 UTC

According to IBM, Informix does have limits, so the question is: Has Seti reached those limits or will it ever reach those limits?

Informix limits
ID: 1857273 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1857278 - Posted: 23 Mar 2017, 22:47:24 UTC - in response to Message 1857242.  

A succinct analysis, Stephen. I agree that a reduced deadline would almost immediately have a huge impact on database size and performance. But do we really think we can change the mindset of the project scientists that SETI needs to be anything other than a screensaver app from the late 90's that uses your spare CPU cycles when you aren't using your computer? The project is still tuned to the lowest common denominator of PC hardware.


. . But that is the part I don't understand. If they simply want to enable low end machines to contribute on a part time basis, then those machines will return only 1 or 2 WUs per week or maybe per fortnight. If so, they do not need a cache of more than 4 or 5 units (just in case they hit a few noise bombs), and the deadline does not need to be more than 50% longer than their average return time: if that is 1 week that would make it 10 days, if a fortnight then at absolute most 3 weeks. If people are not willing to contribute 10 to 15 hours per week of their PC's idle time, then I wonder whether they are in any way serious about contributing at all. As much as I like to encourage participation and awareness of this project, people need to be conscious that to take part they have to contribute some PC resources, not just have BOINC installed and a couple of hundred tasks sitting on their hard drive not being processed. But that is a pet peeve of mine.

. . From the time I first joined S@H I was very conscious of the slowness of the machine I was running then, a single core Pentium4 3GHz with no usable GPU, and of the time it took to return a completed task (up to 13 hours with the rig left on 24/7) compared to the number of downloaded tasks on my system (14 to 30). I was very conscious of the time they were sitting there for. Of course at first I did not know that BOINC/Seti would adjust the numbers issued as it evaluated the speed of processing, which at that end of the spectrum takes forever when it needs dozens of returned tasks to complete the assessment, and that took weeks to achieve :). I think that rig got up to a RAC of 300 over time. It stayed pretty constant until MB V7 arrived and then it nose dived. Then the PSU blew and after that I gave up until last year. But even at that level of productivity I didn't need deadlines to be months long.

. . Sorry for the saga, but I wanted to make it clear I started as one of the "low end" contributors so many ppl seem so concerned about. And the current length of deadlines was unnecessary then and remains so now.
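
The rule of thumb in this post (a deadline no more than 50% longer than a host's average return time) works out as follows in a small Python sketch. The function name `suggested_deadline_days` is hypothetical, and rounding down to whole days is an assumption chosen to match the post's figure of 10 days for a one week return time.

```python
def suggested_deadline_days(avg_return_days: float, margin: float = 0.5) -> int:
    """Deadline = average return time plus a 50% margin, in whole days.

    Rounding down is an assumption; it reproduces the post's examples.
    """
    if avg_return_days <= 0:
        raise ValueError("average return time must be positive")
    return int(avg_return_days * (1 + margin))


print(suggested_deadline_days(7))   # one week average return -> 10 days
print(suggested_deadline_days(14))  # a fortnight -> 21 days (3 weeks)
```

Either figure is far shorter than the deadlines of two months or more mentioned earlier in the thread.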

Stephen

ID: 1857278 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1857282 - Posted: 23 Mar 2017, 23:01:27 UTC - in response to Message 1857244.  
Last modified: 23 Mar 2017, 23:36:19 UTC

I agree that some relation between RAC and maximum number of workunits would be an improvement. However, to prevent the number of tasks that are out in the field from reaching all-time highs, I think something else is needed. IMHO the high number of results out in the field is not caused by the top crunchers, who usually return their results within 24 hours, but by the results that are returned (too) late. I would suggest dramatically reducing the task deadlines (for some tasks currently over 2 months) to something like 2 or 3 weeks max. Even on relatively slow hardware it should be possible to return results in that timeframe.

No need to reduce deadlines (and make a lot of older hardware, and new ARM hardware, too slow to crunch for the project).
Just make the top limit 2 days of work. Instead of slower hardware carrying 10+ days of work while faster systems run out in a matter of hours, slow systems still get to crunch, faster systems spend less time idle, and the database remains (mostly) manageable.


. . The shortcoming there is that assessment is based on the "capacity" of the machine, as in the number of tasks it could produce in a day based on runtimes, and NOT on returned results. This is why derelict hosts with some serious capacity, such as i7 CPUs with some creditable GPUs, can sit on hundreds of WUs until they time out and still get more. The only way to alleviate their impact is to reduce deadlines so that they cannot sit on them for so long, holding them in limbo for months as they can at present.

Stephen

.
ID: 1857282 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24912
Credit: 3,081,182
RAC: 7
Ireland
Message 1857284 - Posted: 23 Mar 2017, 23:14:17 UTC - in response to Message 1857278.  

+1

I started off with "Classic". I would let my rig crunch whenever I was at home. However, due to the cost of Net usage back then, I would only access the Net from 6pm Friday until 8am Monday, as BT's cost was 0.01p a minute. Work commitments prevented me from continuing to crunch; I was hardly at home.

On returning to a more settled work schedule, I looked again for Seti but found Boinc instead. Continued crunching from where I left off :-)

What I have noticed with a few projects is that the amount of RAM being used can curtail the actual use of the machine for which it was originally built. I did try GPU crunching, but either I was unlucky in getting a bad card every time or I was doing something wrong; I have lost 5 cards since I tried it.

The original concept of Seti was to set & forget, & it used unused clock cycles. Since then, many have done nothing but make the fatcats in the utility companies fatter. That's their choice. The rest of us carry on living in the real world & provide what help we can to the projects.

To see the outcry & demands for explanations when the project itself stated from day one that it cannot guarantee work is hilarious.
ID: 1857284 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1857285 - Posted: 23 Mar 2017, 23:16:24 UTC - in response to Message 1857282.  

This whole discussion is moot. It's been rehashed time and time again. The only way it will ever change is IF and when Dr. A makes the decision.
ID: 1857285 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1857289 - Posted: 23 Mar 2017, 23:27:04 UTC - in response to Message 1857247.  
Last modified: 23 Mar 2017, 23:28:54 UTC

As one with a top twenty cruncher, which most of the contributors to this thread do not have,


. . Kudos on your contribution, but that doesn't make your opinion any more valid or applicable than anyone else's.

I can honestly say - "You greedy bunch of twonks".


. . And I can honestly say "You arrogant snob!"

I live with the limits, the fact is that most weekly outages at least two of my crunchers run out of work. The key thing is what I said earlier, and this is repeating the long stated policy of SETI@Home, that there is NEVER going to be a guarantee of work from the project, be it for a short period of time, or over a longer period.


. . You are not alone; most of the people here have one or more rigs that run out of work during the ever lengthening outages. Your attitude is fine as a personal opinion, but that does not mean others are not entitled to their own or have to adopt yours. As for the "no guarantee of work", that is fine too, in the sense that there may not be work to be processed. But most of the users "moaning" here are concerned one way or another with the efficiency of the processing of their own rigs, and feel a personal involvement with the project to the extent that they would like to see a high efficiency there as well, and that means using resources effectively. So when there IS work to be processed, and there are these high capacity machines being made available to the project at no cost to the project, only to the contributors, some of those contributors, even if you don't, feel that some degree of support and effective utilisation of those resources would be nice. And on that point I would like to know how you justify, even to yourself, the use of the word 'greedy', since there is no reward for any contributor, great or small, except a warm feeling for getting it right and maybe making a difference.

So I would suggest if you don't like that go to another project and donate your time to that one. Otherwise stop moaning.


. . And I say to you if you don't like the "moaning" then perhaps you should restrict yourself to reading other threads and stop jumping all over the people expressing their concerns in this one.

Stephen

.
ID: 1857289 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24912
Credit: 3,081,182
RAC: 7
Ireland
Message 1857290 - Posted: 23 Mar 2017, 23:31:29 UTC - in response to Message 1857285.  

This whole discussion is moot. It's been rehashed time and time again. The only way it will ever change is IF and when Dr. A makes the decision.
Ah but didn't you know that in the Boincverse there are two addictions worse than smack & weed - TC & RAC.

It's non PC to upset those afflicted :-)
ID: 1857290 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.