Message boards :
Number crunching :
Cancelled by project question
| Author | Message |
|---|---|
OzzFan Joined: 9 Apr 02 Posts: 15687 Credit: 84,761,841 RAC: 62
|
That's what I was just thinking. Slow machines wouldn't have been contributing to the science before now anyway. Nothing's changed in that regard. True. As I already said above though, I didn't know about it before. It just came to my attention after this new feature. I honestly thought that every returned result, no matter what place in line it was returned, was valid and used for science. It makes a big difference for those that don't want to pay for electricity just to be the follow-up guy. I'm willing to pay the electric bill if it's going to good (i.e. useful) use. |
OzzFan Joined: 9 Apr 02 Posts: 15687 Credit: 84,761,841 RAC: 62
|
If the quorum is met, and you return work after that time, it has always been scientifically irrelevant. What's the difference? As I said above, the difference is in the fact that I wasn't aware my third result was irrelevant. Now I am. Now I see there's no point in running slower machines other than credit. |
|
Ingleside Joined: 4 Feb 03 Posts: 1546 Credit: 15,832,022 RAC: 29
|
The same as what? The same as before? Yes. But now its work is deemed scientifically irrelevant. That's the difference now. Nothing has really changed, since any result returned after a WU has been validated has never been used for anything except credit purposes. The odds of being last have maybe increased, now that many of the fast computers "needing" a 10-day cache get many of their uncrunched WUs cancelled. Anyway, after the release of Multi-beam, the plan is to wait around a week to see if there are any problems, and afterwards stop sending out a 3rd result except on errors or missed deadlines. Meaning, all SETI results passing validation will finally be scientifically useful, as long as they're returned before the deadline. "I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
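The rule described in this post can be sketched in a few lines. This is only an illustration, assuming a quorum of 2 and that every returned result is valid and agrees with the others; `classify_results` and its labels are hypothetical names, not part of BOINC or the SETI@home server code.

```python
def classify_results(return_hours, quorum=2):
    """Label each result 'quorum' or 'late' by the order in which it was returned.

    return_hours: hours after issue at which each host reported.
    Assumption: all results validate, so the first `quorum` to arrive are the
    ones used for science and anything later only earns credit.
    """
    # Indices of the hosts, sorted from earliest to latest report.
    order = sorted(range(len(return_hours)), key=lambda i: return_hours[i])
    labels = ["late"] * len(return_hours)
    for rank, idx in enumerate(order):
        if rank < quorum:
            labels[idx] = "quorum"
    return labels


# Illustrative numbers: a 110-hour slow host and two faster hosts.
labels = classify_results([110, 6, 30])
print(labels)  # ['late', 'quorum', 'quorum']
```

With these numbers the 110-hour host is always the late one, which is the situation being argued about in this thread.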
|
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0
|
Hey guys, remember lots of people seem to be running large caches. So, if either of the other 2 is running a 5 day cache or more (however fast the boxes) you will have crunched and responded before they get down to it in the cache. I guess this is saying that a small cache is better for slow machines so you start crunching as soon as a WU is sent out. [Edit]I've got about 25 results "Pending" that were sent out more than 5 days ago. At 110 hours, your machine would have been doing useful science on any one of them. [/Edit] |
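The cache argument here is simple arithmetic, sketched below. It assumes a newly downloaded task waits the full cache length before crunching starts, which is a worst-case simplification; `turnaround_hours` is a hypothetical helper, and the 3-hour crunch time for the fast box is an assumed figure (the 110 hours comes from this thread).

```python
def turnaround_hours(cache_days, crunch_hours):
    """Hours from download to report.

    Assumption: the task sits behind a full cache of `cache_days` days of
    other work before it starts, then takes `crunch_hours` to finish.
    """
    return cache_days * 24 + crunch_hours


slow_small_cache = turnaround_hours(cache_days=0, crunch_hours=110)
fast_large_cache = turnaround_hours(cache_days=5, crunch_hours=3)
print(slow_small_cache, fast_large_cache)  # 110 123
```

Under these assumptions the slow host with an empty queue reports before the fast host with a 5-day cache.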
Henk Haneveld Joined: 16 May 99 Posts: 154 Credit: 1,577,293 RAC: 2
|
Wrong. I agree that there will be times when the first and second results will be returned after you have started your result and before your result is returned. But if you run a large cache, the chance that the work you have will either be partnered with at least one other slow host or be work that has been abandoned by at least one host is larger than when you run a small cache. So running a slow host with a large cache is, in my opinion, still useful. Edit: By large cache I mean the maximum of 10 days.
|
|
Grant (SSSF) Joined: 19 Aug 99 Posts: 12990 Credit: 208,696,464 RAC: 690
|
If the quorum is met, and you return work after that time, it has always been scientifically irrelevant. What's the difference? That's what I was just thinking. Slow machines wouldn't have been contributing to the science before now anyway. Nothing's changed in that regard. EDIT- even when 3 results were used for the quorum the odds of a slow machine making up one of those 3 results would have been very, very slim. Grant Darwin NT |
|
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0
|
... and if there is just one work unit at a time, then the behaviour is effectively the same. If the quorum is met, and you return work after that time, it has always been scientifically irrelevant. What's the difference? |
OzzFan Joined: 9 Apr 02 Posts: 15687 Credit: 84,761,841 RAC: 62
|
Look at it this way...Once the system is up and running it should work like this for good old slow-and-sure systems: You're still thinking in multiples. I don't download a "bunch" of workunits. I download one single workunit and it fills the cache. The servers never get a chance to abort anything because it's already working on the single workunit. I will spend 110 hours crunching a workunit, return it, get credit for it, but it will not be relevant science. Just credit. Except... You want me to justify paying an electric bill based upon some small fraction of a chance that someone will error out? The chances are too small for me to care about. I'm not going to pay for a horde of slow crunchers to suck up electricity just to hope I can clean up somebody else's mess. It's simply not worth it. |
OzzFan Joined: 9 Apr 02 Posts: 15687 Credit: 84,761,841 RAC: 62
|
Keep in mind that "connect every '0' days" may be slightly dangerous: it is possible to report before the scheduler knows about the upload. That's why I suggested a small, but non-zero value. I think the issue of having a Connect To of 0 is irrelevant. The point is, it downloads a single workunit and it fills the entire cache. Before 5.10.x, your very slow machines were likely to be the last to report, and while no work was aborted, you were usually last (and perhaps, redundant). Yes, but I wasn't aware of the redundancy until it became an issue or feature. On 5.10.x, if you have a work unit that is already redundant, it has a good chance of being aborted before you crunch it. Only on fast machines. On my slow machine, which downloads a single workunit that fills my entire cache setting, no workunit will be aborted before I crunch it. It will download one, work on it, and always be last/redundant. There's no point to running SETI on slow machines if you want to do useful work. ... and if there is just one work unit at a time, then the behaviour is effectively the same. The same as what? The same as before? Yes. But now its work is deemed scientifically irrelevant. That's the difference now. If I want to do scientifically useful work, there's no point to running SETI on slower machines anymore. |
|
john_morriss Joined: 5 Nov 99 Posts: 72 Credit: 1,969,221 RAC: 110
|
I have noticed that, on average, if a system doesn't return a WU within a 24-ish hour window, it will always be the "last man in" (i.e. the third & redundant result). Look at it this way...Once the system is up and running it should work like this for good old slow-and-sure systems: You get a bunch of WUs, put them at the end of your cache and continue working (On what, you ask? Wait a second). Two other speed demons get the same WU, blast through it, finish it and report. The next time you connect, you're told to forget about those WUs, with no time spent crunching. Except... What if one of those speed demons errors out, or they fail to agree? Now who's the man? Who has the final say on what's what? Those over-clocked water-cooled multi-processor freaks? Not on your life! As long as you finish before the deadline, everyone will wait for you! That's your role in life, cleaning up other people's mistakes... So you will be doing serious work, and all it's costing you is some D/L time... Or so it looks to me...
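The "Except..." case can be sketched as a toy validator. It assumes results can be compared for exact equality and that an errored result is represented as `None`; a real validator compares numerical results with some tolerance, and `canonical_result` is a hypothetical name, not the project's code.

```python
from collections import Counter


def canonical_result(results, quorum=2):
    """Return the value that at least `quorum` hosts agree on, else None.

    Assumption: an errored result is passed in as None and never counts
    towards agreement.
    """
    counts = Counter(r for r in results if r is not None)
    if not counts:
        return None
    value, votes = counts.most_common(1)[0]
    return value if votes >= quorum else None


# Two fast hosts disagree: no quorum yet, so the slow host's result decides.
print(canonical_result(["A", "B"]))       # None
print(canonical_result(["A", "B", "A"]))  # A
# One fast host errors out: again the third result completes the quorum.
print(canonical_result(["A", None, "A"]))  # A
```

In both failure cases the workunit stays open until the third result arrives, which is the only situation in which the slow host's work is decisive.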
|
|
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0
|
Keep in mind that "connect every '0' days" may be slightly dangerous: it is possible to report before the scheduler knows about the upload. That's why I suggested a small, but non-zero value. That isn't saying there may be a better use of the electricity, but I think the new client actually makes your old, slow machines more likely to be effective (more likely to return one of the first two results). Before 5.10.x, your very slow machines were likely to be the last to report, and while no work was aborted, you were usually last (and perhaps, redundant). On 5.10.x, if you have a work unit that is already redundant, it has a good chance of being aborted before you crunch it. ... and if there is just one work unit at a time, then the behaviour is effectively the same. |
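A simplified model of the 5.10.x behaviour discussed here, assuming (as stated elsewhere in this thread) that a task which has already started crunching is left alone. The `Task` fields and `tasks_to_abort` are hypothetical names for illustration, not BOINC's actual data structures or scheduler logic.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    quorum_met: bool  # other hosts have already returned enough results
    started: bool     # this host has begun crunching the task


def tasks_to_abort(tasks):
    """Names of cached tasks that are redundant and not yet started."""
    return [t.name for t in tasks if t.quorum_met and not t.started]


# A fast host with a deep cache: redundant tasks that are still waiting get dropped.
fast_host = [
    Task("wu_1", quorum_met=True, started=False),
    Task("wu_2", quorum_met=False, started=False),
]
# A slow host whose single task fills the cache and starts immediately.
slow_host = [Task("wu_3", quorum_met=True, started=True)]

print(tasks_to_abort(fast_host))  # ['wu_1']
print(tasks_to_abort(slow_host))  # []
```

Under this model a host that only ever holds one task, and starts it at once, never benefits from the cancellation.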
|
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0
|
I can give a little background information, at least. All the apps, including the MMX one, include our hand vectorized code subroutines as well as standard versions. The app tests which can be used and decides which is fastest. On James' system it is probable that some SSE routines are being used even though he's using the MMX app. OTOH, we're using IPP for FFTs and the MMX build wouldn't get a vectorized version. In addition, the Intel compiler may autovectorize some other areas for the SSE build. James' system isn't the only SSE capable system on which the SSE build fails, though it's fairly rare. There's no obvious cause, I'm not sure I could figure it out even if I decided to concentrate a lot of effort there. Joe |
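The "test which can be used and decide which is fastest" idea can be sketched as follows. This is a generic illustration in Python, not the optimized app's actual code (which is compiled and uses MMX/SSE routines); the function names and the deliberately unusable candidate are assumptions made for the example.

```python
import time


def sum_squares_loop(data):
    """Plain loop version, standing in for a 'standard' routine."""
    total = 0.0
    for x in data:
        total += x * x
    return total


def sum_squares_builtin(data):
    """Alternative version, standing in for a 'vectorized' routine."""
    return float(sum(x * x for x in data))


def sum_squares_unsupported(data):
    """Stands in for a routine this machine cannot run."""
    raise RuntimeError("instruction set not available")


def pick_fastest(candidates, sample, repeats=5):
    """Return the fastest candidate that runs on `sample`, or None."""
    best_fn, best_time = None, float("inf")
    for fn in candidates:
        try:
            fn(sample)  # usability check: skip routines that fail here
        except Exception:
            continue
        start = time.perf_counter()
        for _ in range(repeats):
            fn(sample)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_fn, best_time = fn, elapsed
    return best_fn


chosen = pick_fastest(
    [sum_squares_unsupported, sum_squares_loop, sum_squares_builtin],
    sample=list(range(1000)),
)
print(chosen.__name__)
```

Which candidate wins depends on the machine, so no expected output is shown; the point is only that an unusable routine is skipped and the rest are timed.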
|
Alinator Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0
|
Hmmm... That's interesting. Maybe Simon or someone else from the Coop will see this and provide some insight. I agree though, a compute error is a definite showstopper. ;-) Alinator |
OzzFan Joined: 9 Apr 02 Posts: 15687 Credit: 84,761,841 RAC: 62
|
I have noticed that, on average, if a system doesn't return a WU within a 24-ish hour window, it will always be the "last man in" (i.e. the third & redundant result). My Connect To is at 0 and my Extra Cache is 2.75 days. It does not matter what the cache is because on these slow machines, it will download a single workunit that will fill up an entire 3 day cache. It will start processing it, making the servers unable to cancel the workunit if two have already returned by faster machines, and it will take 110 hours to complete, making it the last man in. It will then return the result (third one of course), download a new one, lather, rinse, repeat. That isn't saying there may be a better use of the electricity, but I think the new client actually makes your old, slow machines more likely to be effective (more likely to return one of the first two results). Not true in practice. Perhaps the theory wasn't worked out too well on this idea. Of course I like the idea of doing more useful science, but now all my slower machines have become redundant. |
|
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0
|
I have noticed that, on average, if a system doesn't return a WU within a 24-ish hour window, it will always be the "last man in" (i.e. the third & redundant result). Oh, I don't know.... If you are running the current 5.10.x client, and you have a short "connect every 'x' days" (like maybe 0.1) and you have a large cache using the "extra days" functionality, you'll be returning work reasonably fast. Probably before a fast machine with "connect every '4' days" or somesuch. That isn't saying there may be a better use of the electricity, but I think the new client actually makes your old, slow machines more likely to be effective (more likely to return one of the first two results). |
|
James Nelson Joined: 23 Mar 02 Posts: 381 Credit: 4,806,382 RAC: 0
|
@ James: Yes, it does say SSE, but SSE doesn't work: it runs for a while and errors out. It won't run the whole way through, so it must not be fully implemented.
|
|
Alinator Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0
|
@ James: Hmmmm.... I was looking over your host list. When I took a look at some of the Result summaries for the 500 MHz Celeron I noticed the Coop app was reporting it as a Coppermine Celeron and capable of SSE. So you should be able to squeeze a little more out of it by running the SSE Coop app on it. Of course this assumes it wasn't being misidentified by the app. Alinator <edit> @ Ozz: Yeah the 500MHz PII part is what got me to take a closer look at the host list. ;-) Also, if I wasn't slugging it out with Dr. Watson for Class supremacy right now, my K6's would be crunching elsewhere. ;-) |
OzzFan Joined: 9 Apr 02 Posts: 15687 Credit: 84,761,841 RAC: 62
|
One is a PII 500, the other is a PIII 700. My fast computers are AMD XP 2100 and 2800; I know that's not so fast next to a quad core, but it's the best I have. PIIs only went up to 450MHz. PIIIs started at 450MHz (Katmai) and went up to 600MHz before getting a revision (Coppermine). Your PII 500MHz is either a PIII or it's a PII-based Celeron. Still, those chips are relatively fast. My K6-2 500MHz is about as fast as a PII 266MHz or 300MHz. Or even my P233MMX. It was always third man in on every task returned. Older Pentiums, Pentium IIs, AMD K5s and AMD K6 series processors all seem to be wasting electricity by contributing to my RAC without producing useful science. |
|
James Nelson Joined: 23 Mar 02 Posts: 381 Credit: 4,806,382 RAC: 0
|
I have noticed that, on average, if a system doesn't return a WU within a 24-ish hour window, it will always be the "last man in" (i.e. the third & redundant result). One is a PII 500, the other is a PIII 700. My fast computers are AMD XP 2100 and 2800; I know that's not so fast next to a quad core, but it's the best I have.
|
OzzFan Joined: 9 Apr 02 Posts: 15687 Credit: 84,761,841 RAC: 62
|
I have noticed that, on average, if a system doesn't return a WU within a 24-ish hour window, it will always be the "last man in" (i.e. the third & redundant result). How old and slow? Mine is a K6-2 500MHz and it is always the third man in. Always redundant, even with the MMX optimized app. I would not consider a PIII 866MHz machine "old and slow". PII and AMD K6 would be old and slow. |
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.