Please turn on splitters

Message boards : Number crunching : Please turn on splitters
Timcom99

Joined: 30 Sep 04
Posts: 105
Credit: 8,927,290
RAC: 0
United States
Message 158763 - Posted: 28 Aug 2005, 19:01:35 UTC

It would be cool if they could accept uploads only today, with no downloads. People have lots of finished work out there that is getting close to timing out. Uploads are only 1/20th the size of downloads and should be easier to handle. Clear out all the work that is out there and let it be validated before opening the spigots and sending out new work units.
ID: 158763 · Report as offensive
Profile Shaktai
Volunteer tester
Joined: 16 Jun 99
Posts: 211
Credit: 259,752
RAC: 0
United States
Message 158770 - Posted: 28 Aug 2005, 19:11:26 UTC
Last modified: 28 Aug 2005, 19:16:22 UTC

Has anyone actually considered the possibility that the SETI team has considered all these issues, and actually has a game plan for this? There is no one else who knows the system's strengths, limitations and problems better than they do, despite the claims of pseudo-experts who have never worked with anything on this scale or quite like it.

This is their battle, give them a chance to fight it on their terms. The splitters will be turned on when it is most advantageous for the project.

Just because you don't know the details of their plan doesn't mean they don't have one. You can't treat SETI BOINC like a home computer or even a corporate intranet. It is much bigger and more complex than that. It's rather impressive that such a small team can manage such a huge global project, and all for a fraction of the cost that it would take in the private-commercial sector.


Team MacNN - The best Macintosh team ever.
ID: 158770 · Report as offensive
Profile The Simonator
Joined: 18 Nov 04
Posts: 5700
Credit: 3,855,702
RAC: 50
United Kingdom
Message 158777 - Posted: 28 Aug 2005, 19:26:18 UTC - in response to Message 158770.  

Has anyone actually considered...the private-commercial sector.

I agree. Trying to compare the S@h project to a home computer is like trying to compare a family car to a Terex Titan: one may be a lot faster and easier to control, but the Titan can carry a much bigger load.
The Berkeley team should be given a medal for managing to run this at all (IMHO).
Life on earth is the global equivalent of not storing things in the fridge.
ID: 158777 · Report as offensive
Profile Digger
Volunteer tester

Joined: 4 Dec 99
Posts: 614
Credit: 21,053
RAC: 0
United States
Message 158789 - Posted: 28 Aug 2005, 19:39:58 UTC - in response to Message 158770.  

Has anyone actually considered the possibility that the SETI team has considered all these issues, and actually has a game plan for this?


I agree, and I think that most folks here understand what you are saying as well. I believe that the majority of posts in this thread are meant solely to show support or opposition to Tom's initial statements, rather than anyone trying to tell Berkeley what they should or shouldn't do. People will always try to second-guess the devs, and as long as folks are hanging around waiting for things to get started again, they'll want to post their opinions and discuss the various possibilities of what comes next. I don't think it harms anyone as long as it stays nice :)

Dig
ID: 158789 · Report as offensive
Steven Wilcox
Volunteer tester

Joined: 23 Sep 99
Posts: 36
Credit: 86,104,929
RAC: 131
United States
Message 158793 - Posted: 28 Aug 2005, 19:50:12 UTC - in response to Message 158756.  

Just to throw my $.02 in.

Disabling all downloads for about a day will accelerate recovery, at least in this case. Here is my rationale: there are 950K+ WUs out there (roughly 2x what is waiting to be validated right now), and I'm also guessing there are a number of results that haven't made quorum yet (they aren't counted in any easily accessible place that I know of). That is 2 days for the validator running flat out (assuming 30K an hour as a max).

We have a couple of known issues that come into play here as well. 1) We know that the project gets hammered after an outage, and that connecting is sporadic at best. 2) Overdue units that would be resent (having timed out) are sitting out here done. Put those together, and allowing work to be sent out would be a huge detriment. Without DL work, we should clear the work out here inside a day. That keeps us from resending expired WUs that end up coming in an hour later (luck of the draw on connecting), and that should help keep the orphaned files down from this restart. This whole time it's been about cleaning the system out, and IMO every machine that crunches for SETI is a part of that system (and the last step of the cleaning is to consolidate what is out there, then see where we are).

That said, if we can build a stash of work up, but keep it from getting assigned (when we go online again) for a little bit, go right ahead and fire up the splitters.
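
The two-day figure in the quoted post can be checked with quick arithmetic. This is only a rough sketch using the numbers given there (950K work units outstanding, about half that many already waiting for validation, and a validator ceiling of 30K per hour). The function name is hypothetical, and it assumes every outstanding unit comes back and needs validating, which the post does not state outright.

```python
def hours_to_clear(outstanding, waiting, per_hour):
    """Hours for the validator to work through everything at a fixed rate."""
    return (outstanding + waiting) / per_hour

outstanding = 950_000        # WUs still out with users (figure from the post)
waiting = outstanding // 2   # post: outstanding is roughly 2x the validation queue
max_rate = 30_000            # validator ceiling per hour (the post's assumed maximum)

hours = hours_to_clear(outstanding, waiting, max_rate)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")  # 47.5 hours, about 2.0 days
```

Counting only the outstanding units would give closer to 32 hours, so the "2 days" estimate seems to cover both the returned work and the existing queue.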



I agree. See my post http://setiathome.berkeley.edu/forum_thread.php?id=19152#158241
from yesterday

Steve
ID: 158793 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 158801 - Posted: 28 Aug 2005, 20:11:58 UTC - in response to Message 158685.  

I agree that splitters probably should stay off until the WFV is caught up and the files are cleared out.

But I'd define 'recovery is complete' as something that will take place quite a while afterwards. For me, recovery is complete would be the point where the user side has cleared out its backlogs (required downloads and uploads) and is operating normally - AND Berkeley is handling the load. Based on history, for an extended outage like this, we're probably talking mid/late this coming week at best to meet that definition of 'recovery is complete'.

We do not want to add more files to an overloaded file system. Splitters should stay off until the recovery is complete.


ID: 158801 · Report as offensive
Profile ghstwolf
Volunteer tester
Joined: 14 Oct 04
Posts: 322
Credit: 55,806
RAC: 0
United States
Message 158816 - Posted: 28 Aug 2005, 20:59:36 UTC - in response to Message 158793.  
Last modified: 28 Aug 2005, 21:00:16 UTC


I agree. See my post http://setiathome.berkeley.edu/forum_thread.php?id=19152#158241
from yesterday

Steve


I hadn't seen that post before, I'm just glad to know that I'm not the only person thinking this way (not a normal situation for me to be in the majority, I do have some crazy thoughts).

@Shaktai- I'm sure they have a plan, and I'm also sure that the Devs will do what they see as best. Most importantly, I'm sure that the Devs really aren't looking at this thread for advice (maybe a momentary distraction, and a laugh). I was taking the thread (and what I offered up as my "plan") for what it is: speculation and opinion. You are right, we (the users) don't know what the plan is, and until we do this will continue. If it were me, I'd rather see a thread like this (civil and filled with interest about what is going on / what's next) than yet another "BOINC sucks"/"It's broken beyond hope" thread.


Still looking for something profound or inspirational to place here.
ID: 158816 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 158818 - Posted: 28 Aug 2005, 21:01:17 UTC - in response to Message 158801.  

I agree that splitters probably should stay off until the WFV is caught up and the files are cleared out.

But I'd define 'recovery is complete' as something that will take place quite a while afterwards. For me, recovery is complete would be the point where the user side has cleared out its backlogs (required downloads and uploads) and is operating normally - AND Berkeley is handling the load. Based on history, for an extended outage like this, we're probably talking mid/late this coming week at best to meet that definition of 'recovery is complete'.

We do not want to add more files to an overloaded file system. Splitters should stay off until the recovery is complete.


We are going to disagree on our definition of "recovery is complete" then.

We know from past experience that there will be a near-crushing load when the scheduler and upload/download servers are restarted. We know that SETI won't handle that load initially. We know that the validator queue is going to climb alarmingly for the first couple of days.

... and we know that once the load starts to flatten out, the validators will catch up, work will be transitioned, and life will return to normal.

We've seen it before.

We won't know that everything is back to normal for a week after SETI goes active, but that isn't part of the recovery.
ID: 158818 · Report as offensive
Profile Shaktai
Volunteer tester
Joined: 16 Jun 99
Posts: 211
Credit: 259,752
RAC: 0
United States
Message 158836 - Posted: 28 Aug 2005, 21:51:29 UTC - in response to Message 158816.  

@Shaktai- I'm sure they have a plan, and I'm also sure that the Devs will do what they see as best. Most importantly, I'm sure that the Devs really aren't looking at this thread for advice (maybe a momentary distraction, and a laugh). I was taking the thread (and what I offered up as my "plan") for what it is: speculation and opinion. You are right, we (the users) don't know what the plan is, and until we do this will continue. If it were me, I'd rather see a thread like this (civil and filled with interest about what is going on / what's next) than yet another "BOINC sucks"/"It's broken beyond hope" thread.


@ ghstwolf -- Well, I agree, this thread has been more civil than some. I guess from my perspective, I would like to see folks thinking in terms of a bigger picture. I'm not IT, but I work closely with IT on frequent occasions (investigations) and know that there is a lot more complexity than is apparent to the average user. I appreciate intelligent discussion around possible solutions, but grow weary of "do it now" statements, or assumptions from people who have not considered the variables involved.

Probably I should take a breather from the forums, but probably won't yet. I'm too interested in observing the progress, so that I can learn from it. I guess that is just a natural inclination for a fraud investigator-analyst.

My intent though, was to try and get folks to take a step back and see the bigger picture, though I may have not succeeded in that goal. Discussion is good, but it needs to be balanced. I think Ned has pretty well hit the nail on the head though with his thoughts of what to expect. It is the most logical course of action. Berkeley already knows that the servers are going to get slammed. They will deal with that as best they can and have most likely devised a plan, but only they can determine the best time to switch or add functions. Of course they are human (thankfully) and may make mistakes. That's okay, they are learning too.



Team MacNN - The best Macintosh team ever.
ID: 158836 · Report as offensive
Profile Keck_Komputers
Volunteer tester
Joined: 4 Jul 99
Posts: 1575
Credit: 4,152,111
RAC: 1
United States
Message 158849 - Posted: 28 Aug 2005, 22:11:07 UTC

I believe that the best throttle for the SETI system is the splitters. If they remain off while other processes resume, it should reduce the overall load, since no new workunits are entering the system. I (and others) have suggested reducing the rate of splitting as a permanent stopgap measure for any time the system starts backing up. That doesn't mean the problem causing the backup should not be addressed, but it would provide more breathing room when there is a problem.
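
To make the throttle idea concrete, here is a minimal sketch of splitting that backs off as the backlog grows. The function name, the linear ramp, and both thresholds are hypothetical and chosen only for illustration; nothing here reflects how the SETI splitters are actually controlled.

```python
def splitter_rate(backlog, normal_rate, soft_limit, hard_limit):
    """Scale the splitting rate down linearly as the backlog grows.

    Full speed at or below soft_limit, stopped at or above hard_limit.
    All four parameters are illustrative, not real project settings.
    """
    if backlog <= soft_limit:
        return normal_rate
    if backlog >= hard_limit:
        return 0.0
    # Between the two limits, ramp down in proportion to the remaining headroom.
    return normal_rate * (hard_limit - backlog) / (hard_limit - soft_limit)

# Example: a backlog halfway between the limits gives half the normal rate.
half_speed = splitter_rate(backlog=300_000, normal_rate=10.0,
                           soft_limit=200_000, hard_limit=400_000)
print(half_speed)  # 5.0
```

A gradual ramp like this would keep some new work flowing during a mild backup, while a full stop only kicks in when the backlog is severe.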

I also think that after an outage like this the reply should include fewer workunits (possibly even zero) than normal and a server requested delay. This would help spread the crush out especially if the server requested delay is more than 4 hours.
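
As a rough illustration of that second suggestion, the sketch below shows a reply policy that hands out no work during recovery and asks the client to wait. Everything in it is an assumption: the function name, the 4 to 8 hour window, and especially the randomized delay, which is one possible way to keep clients from all returning at the same moment. It is not the actual BOINC scheduler logic.

```python
import random

def post_outage_reply(requested_units, recovering,
                      min_delay_h=4.0, max_delay_h=8.0, rng=random):
    """Return (units_to_send, delay_hours) for one hypothetical scheduler reply."""
    if not recovering:
        # Normal operation: grant the request with no extra delay.
        return requested_units, 0.0
    # During recovery: send nothing and ask the client to come back later.
    # The random spread is an added assumption so retries do not line up.
    return 0, rng.uniform(min_delay_h, max_delay_h)

units, delay = post_outage_reply(requested_units=5, recovering=True)
print(units, round(delay, 1))
```

With a delay floor above 4 hours, as suggested, the reconnect crush would be spread over several hours instead of landing in the first few minutes.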
BOINC WIKI

BOINCing since 2002/12/8
ID: 158849 · Report as offensive
Profile ghstwolf
Volunteer tester
Joined: 14 Oct 04
Posts: 322
Credit: 55,806
RAC: 0
United States
Message 158869 - Posted: 28 Aug 2005, 22:54:40 UTC - in response to Message 158836.  

Shaktai- It's all good. Being able to take a step back is always good advice. For example: an hour later, I'm not sure why I felt singled out by your post. Or even more so, why I didn't walk away for a little bit and then see if I felt the same way later (it's not like you specifically called me out), like I usually do.

I'm big enough to say I'm sorry, I certainly didn't intend to sound quite the way the post read. Sitting in a bar, with our beverages of choice it would have come off a little different. I usually try to read every post with that in mind, and I hope most other people do too, but these are times where most people seem a bit on edge around here.


Still looking for something profound or inspirational to place here.
ID: 158869 · Report as offensive
Profile Shaktai
Volunteer tester
Joined: 16 Jun 99
Posts: 211
Credit: 259,752
RAC: 0
United States
Message 158887 - Posted: 28 Aug 2005, 23:47:16 UTC - in response to Message 158869.  
Last modified: 28 Aug 2005, 23:51:04 UTC

... but these are times where most people seem a bit on edge around here.

I think we are all anxious for things to get back to normal. I can only imagine how the Berkeley team feels. They are probably more frustrated than we are, but being scientists, they know the value of not letting emotions control your thoughts. Step back, analyze the situation, evaluate options and then try again, hoping that you are making the right choices.

Don't worry, I didn't take your post personally. It was a reasonable comment.

--------
Thomas Edison --
1. I have not failed. I've just found 10,000 ways that won't work.
2. Opportunity is missed by most people because it is dressed in overalls and looks like work.
3. Genius is one percent inspiration, ninety-nine percent perspiration.
4. Just because something doesn't do what you planned it to do doesn't mean it's useless.
5. If we all did the things we are capable of doing, we would literally astound ourselves.
--------



Team MacNN - The best Macintosh team ever.
ID: 158887 · Report as offensive
Profile MJKelleher
Volunteer tester
Joined: 1 Jul 99
Posts: 2048
Credit: 1,575,401
RAC: 0
United States
Message 158897 - Posted: 29 Aug 2005, 0:06:11 UTC - in response to Message 158887.  

--------
Thomas Edison --
1. I have not failed. I've just found 10,000 ways that won't work.
2. Opportunity is missed by most people because it is dressed in overalls and looks like work.
3. Genius is one percent inspiration, ninety-nine percent perspiration.
4. Just because something doesn't do what you planned it to do doesn't mean it's useless.
5. If we all did the things we are capable of doing, we would literally astound ourselves.
--------

These would be soooo appropriate as sig lines these days....

MJ


ID: 158897 · Report as offensive
Profile [HWU] GHz & CO. - BOINC.Italy
Volunteer tester
Joined: 1 Jul 02
Posts: 139
Credit: 1,466,611
RAC: 0
Italy
Message 158934 - Posted: 29 Aug 2005, 1:31:06 UTC

LOL splitters are Running NOW!

With more WUs ready to send when the scheduler reopens, users can download WUs quickly and won't have to spend more days waiting and seeing "no work from project".

I think the upload/download server is now clean, so the splitters won't slow the validators down as much.
GHz
BOINC.Italy
ID: 158934 · Report as offensive
itenginerd
Joined: 1 Aug 00
Posts: 37
Credit: 39,905
RAC: 0
United States
Message 158935 - Posted: 29 Aug 2005, 1:31:23 UTC

I agree that this is a moot point--the crew in charge will do what they do regardless of what we think/say/do.

But at the same time: the splitters are now up and genning new WUs.
ID: 158935 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 158971 - Posted: 29 Aug 2005, 4:01:22 UTC - in response to Message 158818.  

Yup -- I consider 'recovery is complete' when 'everything is back to normal'. You don't. No big deal, just a matter of doing some translation.

I suspect part of that definition difference reflects the arguments that show up in some of the less civil threads around here.



We are going to disagree on our definition of "recovery is complete" then.

We won't know that everything is back to normal for a week after SETI goes active, but that isn't part of the recovery.


ID: 158971 · Report as offensive
Profile Shaktai
Volunteer tester
Joined: 16 Jun 99
Posts: 211
Credit: 259,752
RAC: 0
United States
Message 158980 - Posted: 29 Aug 2005, 4:26:36 UTC

Well, I'm going to guess that the project team feels they will be able to soon start sending out work again. Probably they had a pretty good sense of how long it would take to generate the work. By tomorrow morning Berkeley time, they will have a pretty good cache.

Team MacNN - The best Macintosh team ever.
ID: 158980 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 158981 - Posted: 29 Aug 2005, 4:30:53 UTC - in response to Message 158971.  

Yup -- I consider 'recovery is complete' when 'everything is back to normal'. You don't. No big deal, just a matter of doing some translation.

I suspect part of that definition difference reflects the arguments that show up in some of the less civil threads around here.

There is a whole interesting discussion about what constitutes recovery. I think as long as the average performance can stay ahead of the bursts, we're there.

Trouble is, we won't know that until a few days after the project reopens.
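
One way to picture "average performance staying ahead of the bursts" is a simple backlog model: each hour some work arrives, the servers clear up to a fixed capacity, and whatever is left carries over. The numbers and the function name below are made up for illustration and are not measurements from the project.

```python
def backlog_over_time(arrivals, capacity, start=0):
    """Track the leftover backlog hour by hour for a fixed processing capacity."""
    backlog = start
    history = []
    for arriving in arrivals:
        # Work that cannot be cleared this hour carries over to the next one.
        backlog = max(0, backlog + arriving - capacity)
        history.append(backlog)
    return history

arrivals = [50, 50, 20, 10, 10, 10]   # hypothetical post-outage burst, then a taper
history = backlog_over_time(arrivals, capacity=30)
print(history)  # [20, 40, 30, 10, 0, 0]
```

Here the average arrival rate (25 per hour) is below capacity (30), so the backlog grows during the burst and then drains to zero. If the average were above capacity, it would never drain, which matches the point that you only know after watching it for a while.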

ID: 158981 · Report as offensive
Profile Shaktai
Volunteer tester
Joined: 16 Jun 99
Posts: 211
Credit: 259,752
RAC: 0
United States
Message 158985 - Posted: 29 Aug 2005, 4:38:44 UTC

Since they have started ramping up the splitters, I am wondering if they are planning on opening up by end of day tomorrow. No guarantees, but it seems that they may consider it an option. The splitters are now generating a lot of new units, at about twice the speed the validators are running. Well, tomorrow is a work day, so I'll just let things take their course and check it out after work tomorrow.

Good night all.


Team MacNN - The best Macintosh team ever.
ID: 158985 · Report as offensive
Swibby Bear

Joined: 1 Aug 01
Posts: 246
Credit: 7,945,093
RAC: 0
United States
Message 158986 - Posted: 29 Aug 2005, 4:38:54 UTC

"Recovery is Complete" sounds very much like George W. Bush's statement that "Mission is Accomplished". Neither is true. Although recovery for SETI IS a lot closer to being complete than Bush's ill-advised war. Ooooops - Sorry. I didn't mean to start a rant. I was just comparing the statements.
ID: 158986 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.