Project Status - 08/26/2005 4pm PST

Message boards : Number crunching : Project Status - 08/26/2005 4pm PST

Steven Wilcox
Volunteer tester
Joined: 23 Sep 99
Posts: 36
Credit: 86,104,929
RAC: 131
United States
Message 158241 - Posted: 27 Aug 2005, 18:36:59 UTC

Just a thought about bringing the system back online. Would it help to allow uploads only until all those pending results are back, and only then allow new workunits to be sent out? I think some workunits may go out that don't need to be sent, only because the finished unit has not been reported. If you're going to clean out the queues, then all queues (client systems) need to get flushed so there's no overlap.

Steve
ID: 158241
MJKelleher
Volunteer tester
Joined: 1 Jul 99
Posts: 2048
Credit: 1,575,401
RAC: 0
United States
Message 158243 - Posted: 27 Aug 2005, 18:41:24 UTC - in response to Message 158205.  

To the best of my understanding, that is pretty much the way things work now. The only time the 'formula' is used is when the 4th result comes back before the first three get validated. Then, the high and low are discarded, and the 2 remaining ones are averaged. Of course, this all assumes that all of the returns pass validation.

Close. The formula is always used. If there are four results to be credited, the high and low are tossed, the remainder is averaged. If there are three results to be credited, the high and low are tossed, and again the remainder is averaged. Since there is only one remaining, of course its average is itself. And if the fourth result comes in before its deadline, it gets credit equal to the other three results.
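As a rough illustration of the rule described above (not the project's actual validator code), the calculation can be sketched in Python. The function name `granted_credit` and the sample claim values are made up for the example. What happens with fewer than three results is not covered in this thread, so the sketch simply rejects that case.

```python
def granted_credit(claimed):
    """Toss the highest and lowest claimed credit, then average the rest.

    `claimed` is a list of the credit claimed by each validated result.
    Hypothetical helper: only the three-or-more case described in the
    post is handled here.
    """
    if len(claimed) < 3:
        raise ValueError("this sketch only covers three or more claims")
    # Sorting puts the low claim first and the high claim last;
    # slicing off both ends leaves the remainder to be averaged.
    remainder = sorted(claimed)[1:-1]
    return sum(remainder) / len(remainder)


# Three results: only the middle claim remains, so it is its own average.
print(granted_credit([20.0, 25.0, 31.0]))

# Four results: the two middle claims are averaged.
print(granted_credit([20.0, 25.0, 27.0, 31.0]))
```

With three claims the result is simply the middle value, which matches the "its average is itself" remark.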

Credit Granting Rules in the Wiki.

MJ


ID: 158243
Bronco
Volunteer tester
Joined: 22 Jun 05
Posts: 123
Credit: 19,340
RAC: 0
France
Message 158299 - Posted: 27 Aug 2005, 21:37:40 UTC - in response to Message 158241.  

Just a thought about bringing the system back online. Would it help to allow uploads only until all those pending results are back, and only then allow new workunits to be sent out? I think some workunits may go out that don't need to be sent, only because the finished unit has not been reported. If you're going to clean out the queues, then all queues (client systems) need to get flushed so there's no overlap.

Steve

Not sure it will be easy to do.

But maybe they'd better clean out WFV (waiting for validation) for results with a deadline before the shutdown, then reopen the scheduler with the validator off. Not the best way to work, but, as far as I understand the process, this would allow results currently reported after their deadline to be granted credit. And assimilation/deletion seem to have some work to catch up on during this time, because they are currently off.

Maybe the splitters should be kept off (at least some of them) until the team is sure that everything's going right ...
"In a world without walls and fences, who needs windows and gates ?"
for the team
ID: 158299
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 158301 - Posted: 27 Aug 2005, 21:44:40 UTC

WFV is under a million....
ID: 158301
Bronco
Volunteer tester
Joined: 22 Jun 05
Posts: 123
Credit: 19,340
RAC: 0
France
Message 158310 - Posted: 27 Aug 2005, 21:58:44 UTC - in response to Message 158301.  
Last modified: 27 Aug 2005, 22:04:38 UTC

WFV is under a million....

999 999 left ... lol

If the rate holds steady, and it had better increase, it should reach 0 at noon UTC.
"In a world without walls and fences, who needs windows and gates ?"
for the team
ID: 158310
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 158312 - Posted: 27 Aug 2005, 22:08:03 UTC - in response to Message 158310.  
Last modified: 27 Aug 2005, 22:08:37 UTC

WFV is under a million....

999 999 left ... lol

If the rate holds steady, and it had better increase, it should reach 0 at noon UTC.

Actually 992,000 and change as of the last update.

The page updates every 10 minutes, but the data sources don't necessarily change at the same time. That means we can see updates that don't change just because they're too early/too late.
ID: 158312
cliff west
Joined: 7 May 01
Posts: 211
Credit: 16,180,728
RAC: 15
United States
Message 158355 - Posted: 27 Aug 2005, 23:41:50 UTC

Waiting for validation 940,029

getting closer
ID: 158355
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 158367 - Posted: 28 Aug 2005, 0:02:18 UTC
Last modified: 28 Aug 2005, 0:02:48 UTC

From another thread...
But if the file access time is/was the problem and the anti-Q's are down below 600,000, then aren't the unvalidated files the problem now? That is, they must take up the bulk of the file space. If so, then why weren't the file deleters turned off for a while and the validators turned on full blast?


Looks like they are running the validators solely now and those special jobs for the antiques are gone. But I still don't get it. The validators don't delete the files; the file deleters do. But they aren't running. So even though the validators are working, the file system is not getting cleaned up. I would have thought that they might have at least one or two pipelines open (assimilator->file deleter) because the problem they are trying to fix is the file system, right?

Not the first time I've been confused by this operation.

May this Farce be with You
ID: 158367
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 158373 - Posted: 28 Aug 2005, 0:11:15 UTC

Speaking of schedules, all of us around here have a keen interest in keeping everything working and everybody happy. Speaking for myself, I have "regular" hours in the day, but when I'm not at the lab I'm checking in every few hours unless I'm away from a computer (or sleeping). I tend to keep a late schedule, and Jeff tends to keep an early one - so there's a "blackout" period between 2am and 5am when Jeff and I are probably asleep. Of course, I'm not sure of Rom's or Dave's schedule but it seems equally random. See - it's Saturday and I'm writing this post (and oddly enough, in between a gig I just played in Palo Alto and a gig tonight in SF - I'm just taking a break for a scant few minutes back at the house).

Logging in from home to turn validators on/off takes me about 45 seconds. It's not that big a deal.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 158373
Tern
Volunteer tester
Joined: 4 Dec 03
Posts: 1122
Credit: 13,376,822
RAC: 44
United States
Message 158391 - Posted: 28 Aug 2005, 0:54:30 UTC - in response to Message 158373.  


Logging in from home to turn validators on/off takes me about 45 seconds. It's not that big a deal.

- Matt


It may not be a big deal, but you have a lot of people out here who really appreciate your taking the time to do it when you are "off duty"!

Thanks Matt!
ID: 158391
Jim Volfan
Joined: 22 May 99
Posts: 52
Credit: 24,239,706
RAC: 90
United States
Message 158403 - Posted: 28 Aug 2005, 1:39:59 UTC

Down to 914000 and some change. I've had results validated for the 14th. We're getting there. Matt, you are appreciated.

ID: 158403
Scarecrow
Joined: 15 Jul 00
Posts: 4520
Credit: 486,601
RAC: 0
United States
Message 158406 - Posted: 28 Aug 2005, 1:46:55 UTC - in response to Message 158373.  

I'm not at the lab I'm checking in every few hours unless I'm away from a computer (or sleeping).

Sleeping?! I'll send along my "Better System Administration Through Better Pharmaceuticals" pamphlet for you to look at.

(For those that didn't pick up on it, that was humor)

Thanks Matt!
:)

ID: 158406
Martin Johnson
Joined: 9 Jun 01
Posts: 201
Credit: 224,995
RAC: 0
United Kingdom
Message 158411 - Posted: 28 Aug 2005, 2:24:45 UTC

Um, er, ahem! The validators are slowing somewhat, should we switch to deleting?
ID: 158411
Shaktai
Volunteer tester
Joined: 16 Jun 99
Posts: 211
Credit: 259,752
RAC: 0
United States
Message 158470 - Posted: 28 Aug 2005, 4:29:17 UTC - in response to Message 158411.  

Um, er, ahem! The validators are slowing somewhat, should we switch to deleting?


Now how do you figure that? From what I see, the average hourly pace continues, and it is a pretty steep dive. I don't see any signs of slowing. Please point out the details of how you made that determination.


Team MacNN - The best Macintosh team ever.
ID: 158470
Scottatron
Joined: 15 Jul 03
Posts: 94
Credit: 220,389
RAC: 0
Australia
Message 158518 - Posted: 28 Aug 2005, 7:32:51 UTC

Seeing as though the scheduler is offline, maybe getting the most powerful S@H server on to the validation queue would be a good move?

I mean, Penguin is a Sun D220R (2 x 440MHz Sparc, 2 GB RAM), and Galileo is a Sun E3500 (6 x 400MHz Sparc, 6 GB RAM). This is from the server status page, so if these are not correct, blame the person who updates that page ;).

Maybe configuring all the machines to be capable of running any process is a feasible idea? Of course letting them run their primary config is the best, but in a case like this, nearly all the servers could be switched to "Validation Mode" and clear things up quick smart?
ID: 158518
Bronco
Volunteer tester
Joined: 22 Jun 05
Posts: 123
Credit: 19,340
RAC: 0
France
Message 158548 - Posted: 28 Aug 2005, 9:38:06 UTC
Last modified: 28 Aug 2005, 9:39:48 UTC

If the file system is the bottleneck, and it probably is, that won't help much. It would be easier to add some more processes on penguin anyway. And nothing allows us to tell that this isn't already the case (I don't believe the status page adapts automatically).

At the moment, validation is doing a pretty good job; next, the assimilators and deleters will have some work to do cleaning up everything that has been validated.
"In a world without walls and fences, who needs windows and gates ?"
for the team
ID: 158548
Metod, S56RKO
Volunteer tester
Joined: 27 Sep 02
Posts: 309
Credit: 113,221,277
RAC: 9
Slovenia
Message 158633 - Posted: 28 Aug 2005, 13:56:58 UTC
Last modified: 28 Aug 2005, 13:58:57 UTC

Assimilators are now running and validators are off. I guess they are off to allow assimilators full access to file storage.

But then again, the assimilators have to do their job so that the deleters can later remove even the canonical result files (these eventually account for approximately 1/4 of uploaded files).

My guess is that in an hour or so we'll see deleters running ...
Metod ...
ID: 158633
cjsoftuk
Volunteer tester
Joined: 3 Sep 04
Posts: 248
Credit: 183,721
RAC: 0
United Kingdom
Message 158636 - Posted: 28 Aug 2005, 14:02:42 UTC - in response to Message 158633.  

Assimilators are now running and validators are off. I guess they are off to allow assimilators full access to file storage.

But then again, the assimilators have to do their job so that the deleters can later remove even the canonical result files (these eventually account for approximately 1/4 of uploaded files).

My guess is that in an hour or so we'll see deleters running ...


Don't forget that the scheduler started itself as well!
ID: 158636
Metod, S56RKO
Volunteer tester
Joined: 27 Sep 02
Posts: 309
Credit: 113,221,277
RAC: 9
Slovenia
Message 158640 - Posted: 28 Aug 2005, 14:07:36 UTC - in response to Message 158636.  
Last modified: 28 Aug 2005, 14:30:31 UTC

Don't forget that the scheduler started itself as well!


There might be a green box there but the attitude towards clients is just the same:

Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of -106

BTW, what does this return value really mean? I know there's a list of codes on Wiki but I can't find it ... :(
Metod ...
ID: 158640
RandyC
Joined: 20 Oct 99
Posts: 714
Credit: 1,704,345
RAC: 0
United States
Message 158644 - Posted: 28 Aug 2005, 14:12:58 UTC - in response to Message 158310.  
Last modified: 28 Aug 2005, 14:13:46 UTC

WFV is under a million....

999 999 left ... lol

If the rate holds steady, and it had better increase, it should reach 0 at noon UTC.


Well....As of 28 Aug 2005 14:00:07 UTC (past noon UTC...unless you mean Monday)
Waiting for validation: 595,548

Regardless, it's a big improvement!!!
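For anyone wanting to redo the estimate, a straight-line extrapolation from two of the counts posted in this thread can be sketched as below. This is only arithmetic under loud assumptions: the drain rate is treated as constant (in practice the validators were being switched on and off), the first sample uses the post's timestamp as its sample time even though the status page only updates every 10 minutes, and the function name `estimate_empty_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def estimate_empty_time(t1, n1, t2, n2):
    """Linearly extrapolate when a shrinking queue would reach zero.

    (t1, n1) and (t2, n2) are two (time, queue length) samples, t1 < t2.
    Assumes the average rate between the samples continues unchanged.
    """
    elapsed = (t2 - t1).total_seconds()
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    rate = (n1 - n2) / elapsed  # results cleared per second
    if rate <= 0:
        raise ValueError("queue is not shrinking between the samples")
    return t2 + timedelta(seconds=n2 / rate)


# Counts quoted in this thread (first sample time taken from the post time).
first = (datetime(2005, 8, 27, 23, 41, 50, tzinfo=timezone.utc), 940_029)
second = (datetime(2005, 8, 28, 14, 0, 7, tzinfo=timezone.utc), 595_548)

eta = estimate_empty_time(*first, *second)
print("Waiting for validation would reach 0 around", eta.isoformat())
```

With these two samples the extrapolation lands on the afternoon (UTC) of 29 Aug, so a Monday estimate is closer than Sunday noon.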


Final Classic total: 11446 WU
Classic CPU hours: 72,366
ID: 158644


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.