When should Seti jettison the weak?

1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 891221 - Posted: 4 May 2009, 16:26:31 UTC - in response to Message 890557.  


Speaking of laptops, I never tried a fun test: Let it run on batteries doing something typical, like typing emails or staring at a spreadsheet. Do this until the battery dies. Now do the same thing with SETI@home running an optimized application. (Keep the screen on to make it an apples-to-apples comparison.) I bet it eats up the battery in half the time.

Nearly 4 1/2 hours on my Fujitsu C1410 without SETI. Under 2 hours crunching.

It's a Core2 Duo something....
ID: 891221
Profile Bob Mahoney Design
Joined: 4 Apr 04
Posts: 178
Credit: 9,205,632
RAC: 0
United States
Message 891282 - Posted: 4 May 2009, 19:48:02 UTC - in response to Message 891221.  


Speaking of laptops, I never tried a fun test: Let it run on batteries doing something typical, like typing emails or staring at a spreadsheet. Do this until the battery dies. Now do the same thing with SETI@home running an optimized application. (Keep the screen on to make it an apples-to-apples comparison.) I bet it eats up the battery in half the time.

Nearly 4 1/2 hours on my Fujitsu C1410 without SETI. Under 2 hours crunching.

It's a Core2 Duo something....

Ned,
Thank you for abusing your laptop in the name of science. :)

That is an impressive battery drain with the CPU, FSB and memory going full tilt, eh?
ID: 891282
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 891308 - Posted: 4 May 2009, 20:58:19 UTC - in response to Message 890677.  

@Josef:

Thanks for your response. You may be the only one who understood my question (and chose to answer it in the sense it was asked).

I wonder, however. You focused on bandwidth as your main concern. Isn't the dormant storage a bigger problem, in that it forces larger server storage and longer backup and restore times?

For example, if the hosts were either all quads or all P-IIs, then under stable conditions the quad would be 20x faster than the P-II. This means that the server storage could be 20x smaller in the quad case than in the P-II case to process the same amount of work per week, all else being equal. Here I'm assuming the deadlines can be reduced accordingly. Doesn't this then reduce the overhead costs I was mentioning and benefit the project, albeit perhaps at the risk of increased bandwidth, of course?

(If so, then the next step in the logic might be to reject the Q's in favor of the CUDA's. But since I've been accused here of being an elitist, I probably don't like angry populist mobs and so won't broach the topic. Yet, I have learned over the decades that those who rush to accuse others of elitism typically, ignorantly, and unabashedly exercise their own version of elitism.)
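To put rough numbers on the storage argument above, here is a minimal sketch in Python (the weekly throughput and workunit size are invented for illustration, and it assumes turnaround scales with host speed, as posited above):

    # Server storage scales with how long each workunit stays in the field.
    WU_PER_WEEK = 1_000_000    # work the project must process per week (invented)
    WU_SIZE_MBYTES = 0.36      # rough size of one workunit (illustrative)

    def storage_gb(turnaround_days):
        # Workunits resident on the servers = throughput x time each one is out.
        in_flight = WU_PER_WEEK / 7.0 * turnaround_days
        return in_flight * WU_SIZE_MBYTES / 1024.0

    print(f"all-quad fleet, 1-day turnaround:  {storage_gb(1):7.1f} GB")
    print(f"all-P-II fleet, 20-day turnaround: {storage_gb(20):7.1f} GB")

The 20x factor falls straight out of the turnaround time, which is the assumption doing all the work here.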

ID: 891308
Profile elbea64

Joined: 16 Aug 99
Posts: 114
Credit: 6,352,198
RAC: 0
Germany
Message 891311 - Posted: 4 May 2009, 21:13:00 UTC

Storage is not a problem, neither financially nor technically.

Bandwidth is expensive, if it is technically realizable at all.
ID: 891311
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 891322 - Posted: 4 May 2009, 21:49:38 UTC - in response to Message 890958.  


I think they do. It looks like I've been targeted for APs. :)

There's a much higher return rate for MB units. In the last hour there have been 38,691 MB units returned & 2,491 AP units returned. It looks as if there are more people crunching MB units than AP units. It would be interesting to see what % of CPU power is on MB units and what % is on AP units. Per hour, it looks as if AP units are on the increase.
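As a quick sketch of that calculation in Python (the per-task crunch times are guesses, just to show the method):

    mb_returned, ap_returned = 38_691, 2_491   # results returned in the last hour

    total = mb_returned + ap_returned
    print(f"MB share of returned results: {mb_returned / total:.1%}")

    # Raw counts don't answer the CPU-power question, because one AP task
    # takes far longer to crunch than one MB task. Weighting each count by
    # a guessed average crunch time gives a rough power split:
    mb_hours, ap_hours = 1.5, 40.0             # illustrative CPU hours per task
    mb_power = mb_returned * mb_hours
    ap_power = ap_returned * ap_hours
    print(f"MB share of CPU power: {mb_power / (mb_power + ap_power):.1%}")

By raw count MB dominates at about 94%, but once the counts are weighted by crunch time the split looks far more even.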

ID: 891322
Profile Megacruncher TSBT
Volunteer tester

Joined: 16 Jul 99
Posts: 7
Credit: 21,941,723
RAC: 0
United Kingdom
Message 891351 - Posted: 4 May 2009, 23:34:08 UTC

Blessed are the weak, for they shall inherit the earth (or is that the meek? Same difference).
Let us jettison no one. The whole point of Seti/Boinc is that the world is full of under-utilised PCs (some slow, some fast) going to waste, so they might as well be used for science.

However some sad souls, and I plead guilty, have perverted this vision and set up dedicated Boinc farms of elite computers that do nothing but run Boinc & inflate their owners' credit.

We may be strong but the world would be a better place (morally and in terms of energy consumption) if only the weak, but all of the weak, were doing Boinc.

The Scottish Boinc Team
ID: 891351
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 891352 - Posted: 4 May 2009, 23:34:15 UTC - in response to Message 891282.  


Speaking of laptops, I never tried a fun test: Let it run on batteries doing something typical, like typing emails or staring at a spreadsheet. Do this until the battery dies. Now do the same thing with SETI@home running an optimized application. (Keep the screen on to make it an apples-to-apples comparison.) I bet it eats up the battery in half the time.

Nearly 4 1/2 hours on my Fujitsu C1410 without SETI. Under 2 hours crunching.

It's a Core2 Duo something....

Ned,
Thank you for abusing your laptop in the name of science. :)

That is an impressive battery drain with the CPU, FSB and memory going full tilt, eh?

Yes, and the fan is much noisier at full tilt.
ID: 891352
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 891355 - Posted: 4 May 2009, 23:42:34 UTC - in response to Message 891308.  

@Josef:

Thanks for your response. You may be the only one who understood my question (and chose to answer it in the sense it was asked).

I wonder, however. You focused on bandwidth as your main concern. Isn't the dormant storage a bigger problem, in that it forces larger server storage and longer backup and restore times?

For example, if the hosts were either all quads or all P-IIs, then under stable conditions the quad would be 20x faster than the P-II. This means that the server storage could be 20x smaller in the quad case than in the P-II case to process the same amount of work per week, all else being equal. Here I'm assuming the deadlines can be reduced accordingly. Doesn't this then reduce the overhead costs I was mentioning and benefit the project, albeit perhaps at the risk of increased bandwidth, of course?

(If so, then the next step in the logic might be to reject the Q's in favor of the CUDA's. But since I've been accused here of being an elitist, I probably don't like angry populist mobs and so won't broach the topic. Yet, I have learned over the decades that those who rush to accuse others of elitism typically, ignorantly, and unabashedly exercise their own version of elitism.)


Fast machines like a Core2 Quad, paired with a Core2 Quad wingman, aren't going to have work in the database for long, but they will keep calling for new work, adding back records as quickly as they're released. They'll always have more workunits in their queue than a slow machine.

The P-II work units will hang around a lot longer, but there won't be very many.

I'd have to do some simulations, but I'd bet in "like-machine" pairings, the quads hurt more (but they're worth it, because they do so much more).

The ugly case is a Core2 Quad vs. a P-II (or my lowly Via C7) -- but again this is limited by the speed of the slow machine, and the (relatively small) amount of work it will fetch in a given time.

... and I am ignoring idle hosts (those who haven't reported in months or years) and I'm ignoring the "cost" of a machine that disappears. I think new hosts should be limited to very little work until they've returned valid results, but that's a different discussion.

I don't think the idle machines cost that much in database size/performance. They hurt a little when it is time to do backups.
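For what it's worth, a rough Monte Carlo of that pairing bet might look like this (a minimal sketch: the per-host rates, the 2-day mean turnarounds, the exponential model, and the hold-until-both-report rule are all assumptions):

    import random

    def pairing_load(rate_a, rate_b, mean_turn_a, mean_turn_b, n=100_000):
        """Average days a result sits in the database, and the in-flight count."""
        residency = 0.0
        for _ in range(n):
            ta = random.expovariate(1.0 / mean_turn_a)   # host A turnaround, days
            tb = random.expovariate(1.0 / mean_turn_b)   # host B turnaround, days
            residency += max(ta, tb)                     # held until both report
        residency /= n
        throughput = min(rate_a, rate_b)   # the slower wingman limits the pair
        return residency, throughput * residency         # Little's law: L = lambda * W

    for label, args in [("quad vs quad", (40, 40, 2, 2)),
                        ("P-II vs P-II", (2, 2, 2, 2)),
                        ("quad vs P-II", (40, 2, 2, 2))]:
        days, in_db = pairing_load(*args)
        print(f"{label}: ~{days:.1f} days residency, ~{in_db:.0f} results in DB")

On those made-up numbers, the like-machine quad pairing keeps far more records in the database than anything involving a P-II, simply because it pushes so much more work through.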
ID: 891355
daysteppr Project Donor

Joined: 22 Mar 05
Posts: 80
Credit: 19,575,419
RAC: 53
United States
Message 891416 - Posted: 5 May 2009, 1:21:16 UTC
Last modified: 5 May 2009, 1:22:09 UTC

My biggest issue with the concept of 'jettisoning the weak' is where would it stop? First it's P2s. Then P3s. Then what? AMDs as well? It's already a well-proven fact that AMDs don't crunch as fast as Intel chips. Should I be ejected from Seti because I don't use an Intel chip?

I crunch an AP 5.03 WU in 41-44 hrs (AMD X2 4600+, 2.4 GHz) with optimized apps. A C2Q Q6600 (2.4 GHz) using optimized apps does it in 1/2 to 1/3 the time. Does this mean I'm not good enough to crunch Seti? I think I'm good enough to crunch Seti, even if others don't think so.

First they came for the Pentiums and I said nothing. Then they came for the Pentium 2s and I said nothing. They came for the Pentium 3s and I said nothing. When they came for the AMDs there was nothing left to say....

IMO if someone wants to use a P2 to crunch, I'm OK with that. So it takes longer to validate; I think I'll survive. If they want to pay the electricity to run it, let them. I'm not going to tell someone what they should or shouldn't do with their machines. If I start, it means they can tell me what to do with my machine....

Sincerely,
Daysteppr
ID: 891416
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65736
Credit: 55,293,173
RAC: 49
United States
Message 891419 - Posted: 5 May 2009, 1:43:05 UTC - in response to Message 891416.  

My biggest issue with the concept of 'jettisoning the weak' is where would it stop? First it's P2s. Then P3s. Then what? AMDs as well? It's already a well-proven fact that AMDs don't crunch as fast as Intel chips. Should I be ejected from Seti because I don't use an Intel chip?

I crunch an AP 5.03 WU in 41-44 hrs (AMD X2 4600+, 2.4 GHz) with optimized apps. A C2Q Q6600 (2.4 GHz) using optimized apps does it in 1/2 to 1/3 the time. Does this mean I'm not good enough to crunch Seti? I think I'm good enough to crunch Seti, even if others don't think so.

First they came for the Pentiums and I said nothing. Then they came for the Pentium 2s and I said nothing. They came for the Pentium 3s and I said nothing. When they came for the AMDs there was nothing left to say....

IMO if someone wants to use a P2 to crunch, I'm OK with that. So it takes longer to validate; I think I'll survive. If they want to pay the electricity to run it, let them. I'm not going to tell someone what they should or shouldn't do with their machines. If I start, it means they can tell me what to do with my machine....

Sincerely,
Daysteppr

Couldn't agree more, Daysteppr. It's their electricity and their bill too, not ours.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 891419
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 891480 - Posted: 5 May 2009, 4:33:53 UTC - in response to Message 891308.  

@Josef:

Thanks for your response. You may be the only one who understood my question (and chose to answer it in the sense it was asked).

I wonder, however. You focused on bandwidth as your main concern. Isn't the dormant storage a bigger problem, in that it forces larger server storage and longer backup and restore times?

That was my conclusion in terms of adding new hosts, and it wouldn't really matter what kind of hosts; if productivity were increased by 50% the download bandwidth would become a critical issue. It already is whenever there's a large proportion of 'shorty' work.

However, your hypothesis is that not delivering any more work to the least productive hosts might have enough benefit to outweigh both the lost productivity and the onus of having become more 'elitist'. So let's look at a couple of scenarios resulting from reducing the deadlines (those control whether the Scheduler will assign a task to a given host).

Scenario 1: Reduce MB deadlines by a factor of 5. (The calculation in the splitter uses a 40 million whetstone base, change it to 200 million). So a shorty deadline would be 7/5 = 1.4 days, midrange around 4.5 to 5 days, max around 12 days. That's enough to have a definite effect. Almost all active hosts would be capable of doing the work in time if they crunch 24/7, some only crunching part time and some like my 200 MHz Pentium MMX with ~176 million whetstones wouldn't make the cut. So lost productivity would be fairly small. It would reduce the amount of storage space needed for workunits since turnaround would be faster. But it would make host queues longer than 1 day impractical, and my guess is that could lead to loss of quite a few of the top hosts too, reducing productivity more. Even users with ordinary hosts might leave after their hosts went idle during a moderately long project outage.

Scenario 2: Same reduction, but floor the deadlines at 5 days. That allows a queue of 3 or 4 days to cover project accidents. Slow hosts would be able to do shorties, so stubborn users with such systems would usually be able to get work if they tried long enough. So there's very little productivity decrease, and maybe a reasonable amount of protection against the 'elitist' tag. The shorter deadlines for longer work reduce storage requirements somewhat. The main problem I see is that BOINC allows setting 10 days of extra work and the project wouldn't fulfill that setting.
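The deadline arithmetic of the two scenarios, as a sketch (the current deadlines are back-solved from the post-reduction figures above, so treat them as approximate):

    def new_deadline(current_days, scale=5.0, floor_days=None):
        # Scenario 1: divide the deadline by `scale`; Scenario 2: also floor it.
        d = current_days / scale
        return max(d, floor_days) if floor_days else d

    for name, current in [("shorty", 7.0), ("midrange", 23.0), ("max", 60.0)]:
        s1 = new_deadline(current)
        s2 = new_deadline(current, floor_days=5.0)
        print(f"{name}: {current:.1f} d now -> {s1:.1f} d (S1), {s2:.1f} d (S2)")

The 5-day floor only changes the short deadlines, which is what lets slow hosts keep doing shorties in Scenario 2.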

So now we bring Astropulse into consideration. In terms of the slowest hosts it was out of range. But in terms of users who want to have 10 days of work on hand it's nearly ideal.

OTOH, I still don't see any major advantage to the project or users from reduced deadlines for MB. The Overland Storage hardware donation seems to have put an end to Matt's comments about running out of room to store workunits. So the advantage seems to me basically shortened turnaround for MB which would tend to reduce pending a little. I don't really understand the concern about prompt credit granting, and Astropulse is the main factor there anyhow.

For example, if the hosts were either all quads or all P-IIs, then under stable conditions the quad would be 20x faster than the P-II. This means that the server storage could be 20x smaller in the quad case than in the P-II case to process the same amount of work per week, all else being equal. Here I'm assuming the deadlines can be reduced accordingly. Doesn't this then reduce the overhead costs I was mentioning and benefit the project, albeit perhaps at the risk of increased bandwidth, of course?

Either a PII or a quad can do MB work in less than a day, but the project average turnaround time for MB is seldom less than 2 days. Twenty PII hosts with queues of 2 days have precisely the same effect on storage as one quad with a queue of 2 days. That is, it takes about 2 days from the time a task is assigned until the result is reported. Not all of the project average is due to queue length; there are some hosts which take longer than 2 days to do MB work either because they are inherently slow or because they are allowed to crunch only part time. But given the average RAC of 233 I derived in the earlier post, we have to think that crunch time is a minor factor in average turnaround time.
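In Little's-law terms the equivalence is immediate: results resident on the server = reporting rate x turnaround time, no matter how many hosts supply that rate. A one-line check (per-host rates illustrative):

    turnaround_days = 2.0                    # queue-dominated, per the above
    pii_fleet = 20 * 1.0 * turnaround_days   # twenty P-IIs at ~1 result/day each
    one_quad = 1 * 20.0 * turnaround_days    # one quad at ~20 results/day
    print(pii_fleet, one_quad)               # 40.0 40.0 -- same storage footprint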
                                                                  Joe
ID: 891480
HFB1217
Joined: 25 Dec 05
Posts: 102
Credit: 9,424,572
RAC: 0
United States
Message 891496 - Posted: 5 May 2009, 5:28:50 UTC
Last modified: 5 May 2009, 5:43:37 UTC

I am very lucky: besides being addicted to cutting-edge computers, I can afford to buy new ones at a rather fast rate. One of my hobbies is crunching for Seti and WCG. The impetus to upgrade is also rewarded by punching out more workunits, faster, for both projects.

Others may not have the funds or the desire to upgrade constantly to maximize their workunit output. They still want to feel relevant and to be contributing to a cause. So should they be denied this just because they cannot, or do not need to, have a super-fast system? It is a basic concept of distributed computing to make the crunchers feel welcome and important regardless of the amount of work they can produce.

The only fault I can see is that sometimes we must wait for a wingman to complete a task, and that is really no issue that matters. If it really makes a difference, then maybe someone other than the slower cruncher has the serious problem, and lacks the understanding of what we are all about.

Sorry if this is too blunt or offensive.

Hank
Come and Visit Us at
BBR TeamStarFire


****My 9th year of Seti****A Founding Member of the Original Seti Team Starfire at Broadband Reports.com ****
ID: 891496
Ianab
Volunteer tester

Joined: 11 Jun 08
Posts: 732
Credit: 20,635,586
RAC: 5
New Zealand
Message 891507 - Posted: 5 May 2009, 6:30:33 UTC

I don't think there is any specific plan or need to 'jettison' older PCs. As long as they can return work units in a sensible time, they are contributing useful work and are welcome in the project. The actual database load that a few old Celerons or PIIIs put on the project is negligible. Some might say the work they do is negligible too, but that's not the point.

But at some point upgrades to the software mean that older machines just can't do the job any more. I have an old PII 300 laptop that can still run BOINC, but it doesn't have enough RAM to run Seti. So effectively it has been 'jettisoned'.

Now I imagine that sometime in the future there will be new apps that won't run on a PIII / Win98 / 128 MB. Then those machines will get jettisoned. That's just the way it is.

But if your machine meets the minimum specs for the project, then join up and crunch. I seem to have more wingman problems with modern, maybe overclocked, rigs that crash, vanish or return heaps of invalid results.

If your old PIII takes a week to return a valid result, so what? Reliable machines that meet the minimum spec and turn in good results inside the deadlines are what the project needs.

As things are now, you don't need a bleeding-edge machine to contribute useful work. Any P4 that runs for 8 hours a day will complete AP units in good time, and old P4s are the sort of thing you can get for free by asking nicely. So there is not really any excuse for NOT having a PC that meets minimum specs.

Ian
ID: 891507
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 891619 - Posted: 5 May 2009, 14:58:38 UTC - in response to Message 891480.  

(quote deleted)

All good points. It helped that you could quantify your model.


OTOH, I still don't see any major advantage to the project or users from reduced deadlines for MB. The Overland Storage hardware donation seems to have put an end to Matt's comments about running out of room to store workunits. So the advantage seems to me basically shortened turnaround for MB which would tend to reduce pending a little. I don't really understand the concern about prompt credit granting, and Astropulse is the main factor there anyhow.

For example, if the hosts were either all quads or all P-IIs, then under stable conditions the quad would be 20x faster than the P-II. This means that the server storage could be 20x smaller in the quad case than in the P-II case to process the same amount of work per week, all else being equal. Here I'm assuming the deadlines can be reduced accordingly. Doesn't this then reduce the overhead costs I was mentioning and benefit the project, albeit perhaps at the risk of increased bandwidth, of course?

Either a PII or a quad can do MB work in less than a day, but the project average turnaround time for MB is seldom less than 2 days. Twenty PII hosts with queues of 2 days have precisely the same effect on storage as one quad with a queue of 2 days. That is, it takes about 2 days from the time a task is assigned until the result is reported. Not all of the project average is due to queue length; there are some hosts which take longer than 2 days to do MB work either because they are inherently slow or because they are allowed to crunch only part time. But given the average RAC of 233 I derived in the earlier post, we have to think that crunch time is a minor factor in average turnaround time.
                                                                  Joe

I wasn't worried about throughput in my original thoughts. However, a quad with two days' work holds more storage hostage on the Berkeley servers than a P-II would. Storage was my main concern; not so much running out of it, but the maintenance of it.

So I wonder what the seti staff would say are the biggest issues to solve? It seems to me that bandwidth, while limited, is not a problem today, nor is the total storage capacity. There must be something that keeps those folks busy that we can 'fix', in a virtual sense here on these boards.

ID: 891619
Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,979,050
RAC: 0
Malaysia
Message 891632 - Posted: 5 May 2009, 15:49:41 UTC
Last modified: 5 May 2009, 15:51:15 UTC

Actually I don't buy the basic premise of this argument. Let's just say that on average all hosts have a 2-day cache. Think carefully about this:

1 C2D machine doing 20 WUs a day will keep 40 WUs in its cache. The project gets 20 WUs done in a day.

10 P3 machines doing 2 WUs a day will keep 40 WUs in their caches. The project gets 20 WUs done in a day by these machines.

In each case, the net data consumed is the same 20 WUs per day, and the storage requirement is 40 WUs at any time. It does not matter if the work is done by 1 fast machine or 10 slow ones.

So from a global project perspective, as long as the same computational power is maintained, the workload and transfer load stay the same. If the project wants to get 20 WUs a day done by a host population with 2-day caches, it will have to transfer 20 WUs a day and keep 40 WUs in its storage.

Think of it like the First Law of Thermodynamics, or any conservation laws for that matter.

The simple thing is, if the project thinks it has too much computing power, it should jettison some old clients, or have them do other things. That was one reason BOINC was created.

Right now the project needs more processing power. Unless it's ready to make do with less, jettisoning the weak will weaken the project. There is no operational benefit to be had. The only benefit is the overall "greenness", if that is measurable.
ID: 891632
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 891717 - Posted: 5 May 2009, 23:40:29 UTC - in response to Message 891632.  

I don't think the 'thermodynamics' argument is applicable here, because at the very least it suggests that we should run more slow machines than fast ones. Say 20:1! 100:1! Etc.

Maybe I'm wrong, but I think that the amount of time the average wu is waiting to be validated is an important concern because of the concomitant server storage/maintenance demand. This sort of latency obviously will be greater with the slower machines than with the faster ones.

For example, in the (unrealistic) limit of no latency, we would chew through the data tapes as fast as they are split, so we would need minimal server storage and backup resources. More specifically, if a CUDA array could be set up to provide zero latency, then the amount of overhead (not counting communications, which is important) for the array would be very small; an equivalent array of P-IIs with the same return rate would necessitate a whole lot of storage at Berkeley, waiting for each wu to be returned. (I'm ignoring the quorum here for simplicity.)

The role of the cache is to make the latency worse (and harder to predict because, in part, the users control the cache settings). But this 'cost' is compensated by the need to keep the hosts computing during downtimes or when a host cannot connect continuously. It is obviously necessary.
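Put numerically (a sketch; the project-wide return rate is invented), the count of results held on the servers is just the return rate times the latency:

    RESULTS_PER_DAY = 100_000   # fixed project-wide return rate (invented)

    for fleet, latency_days in [("near-zero-latency CUDA array", 0.05),
                                ("typical mixed fleet", 2.0),
                                ("slow hosts with big caches", 10.0)]:
        in_flight = RESULTS_PER_DAY * latency_days
        print(f"{fleet}: ~{in_flight:,.0f} results held at Berkeley")

Halving the latency halves the resident storage even when throughput is unchanged, which is why latency, not just speed, matters here.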
ID: 891717
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 891808 - Posted: 6 May 2009, 3:11:43 UTC

SETI@Home should jettison the weak ASAP.
ID: 891808
Rob.B

Joined: 23 Jul 99
Posts: 157
Credit: 1,439,682
RAC: 0
United Kingdom
Message 891809 - Posted: 6 May 2009, 3:25:54 UTC
Last modified: 6 May 2009, 3:26:18 UTC

Currently this is a moot point. Seti@home is losing clients at quite a pace for a variety of reasons; in a recent post one of the staff commented on the fact that they do not have enough client crunching power for their needs. Taking this into account, excluding whole classes of machines (just because they are not the latest model, or are perceived as too slow by some) is not really the way to go.

Rob.
ID: 891809
Profile Virtual Boss*
Volunteer tester
Joined: 4 May 08
Posts: 417
Credit: 6,440,287
RAC: 0
Australia
Message 891815 - Posted: 6 May 2009, 4:13:19 UTC

My Celeron 533 completes an average WU in about 48 hrs.

It currently has 5 WUs pending with wingmen who are running C2D, quad or dual Xeons.

According to the apparent logic of some of the posters in this thread, these "modern" machines are creating problems of storage etc. by returning results so slowly, and should be jettisoned as well.

[Rant = ON]
Sounds more like some people don't want to wait a few days before they get credit, so they want to get rid of "anyone" that causes a delay, using the poor excuse of blaming those with a slower processor for the delays.

CHILL OUT

[Rant = OFF]
ID: 891815
Bounce

Joined: 3 Apr 99
Posts: 66
Credit: 5,604,569
RAC: 0
United States
Message 891824 - Posted: 6 May 2009, 4:46:33 UTC
Last modified: 6 May 2009, 4:49:49 UTC

The thread reads like many complaints from drivers on the highway. Anyone slower than me is an obstruction and anyone who drives faster than me is a nutjob.

An opposite argument could be made that the most powerful systems should be dropped because of their demand on the throughput of the system and the strain they place on the project to keep up with the up/downloads.

Reality is that both groups are needed. If 50% of the hypothetical 200k machines are returning 1 WU a week, then that's 100k WUs a week that S@H would not have if they were pruned from the system. That's a substantial contribution to discard for the instant gratification of WU stat hounds.
ID: 891824