concerns about CUDA and seti.....
Author | Message |
---|---|
nick Joined: 22 Jul 05 Posts: 284 Credit: 3,902,174 RAC: 0 |
Hi, I have noticed that several of my pending work units are going to error out and lose SETI data. My machine has not had an error before, but when it runs against a CUDA GPU, the results do not match, as in this case: http://setiathome.berkeley.edu/workunit.php?wuid=677997049 which I think will error out and be tossed out. But in this case, same machine, the third run of the WU is being done by a CPU: http://setiathome.berkeley.edu/workunit.php?wuid=677765004 and I think it should match my machine. Anyway, just thoughts on how we are crunching... Nick |
Helli_retiered Joined: 15 Dec 99 Posts: 707 Credit: 108,785,585 RAC: 0 |
Yup. I have seen that too (my RAC has dropped each day) on all of my rigs since the last outage. Several tasks are waiting because the status is "Completed, validation inconclusive", with 2, 3 or 4 results from different client software... Helli A loooong time ago: First Credits after SETI@home Restart |
Miep Joined: 23 Jul 99 Posts: 2412 Credit: 351,996 RAC: 0 |
"Hi, I have noticed that several of my pending work units are going to error out and lose SETI data. My machine has not had an error before, but when it runs against a CUDA GPU, the results do not match, as in this case: http://setiathome.berkeley.edu/workunit.php?wuid=677997049 which I think will error out and be tossed out. But in this case, same machine, the third run of the WU is being done by a CPU: http://setiathome.berkeley.edu/workunit.php?wuid=677765004 and I think it should match my machine." http://setiathome.berkeley.edu/workunit.php?wuid=677997049 http://setiathome.berkeley.edu/workunit.php?wuid=677765004 Looks like one idiot running old V12 optimized on a Fermi and 6.09 got false -9 overflows. It would need to go to a CPU, or maybe to somebody running x32f, which produces far fewer false -9s. I've no idea if the correct result will be thrown out when it clashes with two -9s. Bad luck - the errors are not on your side. Edit: correction. Two idiots with V12 on a Fermi. Carola ------- I'm multilingual - I can misunderstand people in several languages! |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14682 Credit: 200,643,578 RAC: 874 |
"Edit: correction. Two idiots with V12 on a Fermi" Both hosts (5305178, 5257703) are known offenders already on Joe Segur's list. |
nick Joined: 22 Jul 05 Posts: 284 Credit: 3,902,174 RAC: 0 |
So I just happened to get matched with a pair of misconfigured GPUs, and not all, or even most, do this? Nick |
Frizz Joined: 17 May 99 Posts: 271 Credit: 5,852,934 RAC: 0 |
"Looks like one idiot running old V12 optimized on a Fermi ..." Too much money - too little brains ;) Statistics like this make me laugh - or cry. Not decided yet :/ Number of tasks completed: 3665; Max tasks per day: 100; Number of tasks today: 224; Consecutive valid tasks: 0. (Possibly this has been discussed a million times, but why is "Max tasks per day" not reduced to ZERO for such hosts?) Petition against 1366x768 glare displays: http://www.facebook.com/home.php?sk=group_153240404724993 |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Yes, it was discussed. It looks like the V12 binary, which is not Fermi-compatible, can still process tasks correctly at some ARs. This allows a small number of validations. So the quota system is ineffective at inhibiting such misconfigured hosts. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14682 Credit: 200,643,578 RAC: 874 |
"So I just happened to get matched with a pair of misconfigured GPUs, and not all, or even most, do this?" Exactly so. The tasks affected will be re-checked by another computer, and in the vast majority of cases, the result from the misconfigured host will be thrown out. Unfortunately, the bad hosts get through (and waste) a huge number of tasks every day. 5257703 has downloaded about 2,000 tasks (over 700 MB of data) in the last 24 hours. So you come across their droppings more often than you might expect. Given that they are consuming a non-trivial proportion of a scarce resource (bandwidth), I wonder whether the time has come to suggest that the hosts Joe identified should be blacklisted by the project (until/unless the software installation is corrected, of course). |
Helli_retiered Joined: 15 Dec 99 Posts: 707 Credit: 108,785,585 RAC: 0 |
We need a Server based Blacklist. LOL hehe Helli A loooong time ago: First Credits after SETI@home Restart |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Or to improve BOINC's quota system. It should protect from such cases too.... |
Kevin Olley Joined: 3 Aug 99 Posts: 906 Credit: 261,085,289 RAC: 572 |
"Looks like one idiot running old V12 optimized on a Fermi ..." The problem with looking at "Consecutive valid tasks" is that anyone who has an invalid task, for whatever reason, will have it set back to zero. A more accurate assessment would be to look at the percentage of valid tasks over a set time period or a specified quantity of work units. Unfortunately, this would probably be too server intensive. Kevin |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Looks like one idiot running old V12 optimized on a Fermi ..." Not too server intensive. To calculate such a proportion, only the number of failed tasks and the total number of tasks are needed. The number of tasks today is already kept. Nothing prevents keeping the number of invalid tasks today too. The big problem with a "daily" % is that task validation can happen much later than task reception. So it is better to keep a "total" % for the host. |
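The two counters Raistmer mentions are enough to sketch the host-level percentage. The following is a minimal illustration only, not BOINC server code: the `HostStats` class, its field names, and the sample outcomes are all hypothetical. It contrasts the lifetime valid fraction with the "Consecutive valid tasks" counter that Kevin objects to.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    """Hypothetical per-host counters; not the actual BOINC host record."""
    total_tasks: int = 0        # every task that reached validation
    invalid_tasks: int = 0      # tasks the validator rejected
    consecutive_valid: int = 0  # resets to zero on any invalid task

    def record(self, valid: bool) -> None:
        """Update the counters when a task is validated (possibly days late)."""
        self.total_tasks += 1
        if valid:
            self.consecutive_valid += 1
        else:
            self.invalid_tasks += 1
            self.consecutive_valid = 0

    def valid_fraction(self) -> float:
        """Lifetime share of valid tasks; a host with no history counts as good."""
        if self.total_tasks == 0:
            return 1.0
        return (self.total_tasks - self.invalid_tasks) / self.total_tasks


# A reliable host that just had one unlucky invalid task.
good = HostStats()
for outcome in [True] * 99 + [False]:
    good.record(outcome)

# A misconfigured host that just had one lucky valid task.
bad = HostStats()
for outcome in [False] * 9 + [True]:
    bad.record(outcome)

print(good.consecutive_valid, good.valid_fraction())  # 0 0.99
print(bad.consecutive_valid, bad.valid_fraction())    # 1 0.1
```

Because the counters are updated whenever validation happens, the lifetime fraction does not depend on a task being validated on the day it was received, which is the weakness of a daily percentage.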
Westsail and *Pyxey* Joined: 26 Jul 99 Posts: 338 Credit: 20,544,999 RAC: 0 |
Call me an extremist but... Why not have: Consecutive valid tasks 0 = quota 1/day? If you get an occasional invalid it will build back up fast. Also, it will encourage running with an eye to accuracy. Just a thought... be gentle.. ;) *ducks* "The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14682 Credit: 200,643,578 RAC: 874 |
"We need a Server based Blacklist. LOL" The facility exists: http://boinc.berkeley.edu/trac/wiki/BlackList (though there is some suggestion it may have been broken by the app_version breakdown). We'll never know unless the project decide it's enough of a problem to intervene. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Call me an extremist but..." Well, it will harm project performance but will not solve the problem. Occasional valids will build up quota fast too. Something that takes host history into account is needed... EDIT: history is, for example, the total % of validation for the host. Hosts with a low % should be penalized for a long time even if the last few results are good ones. |
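The difference between the two proposals can be sketched in a few lines. Everything here is an assumption made for illustration: the function names, the doubling rule used for "builds back up fast", the 50% cut-off, and the example hosts are hypothetical, and none of it is BOINC's actual quota logic.

```python
def quota_by_streak(consecutive_valid: int, base_quota: int = 100) -> int:
    """Streak-based rule: a streak of 0 gives 1 task/day.

    Assumption: the quota doubles with each consecutive valid task
    until it reaches base_quota.
    """
    return min(base_quota, 2 ** consecutive_valid)


def quota_by_history(valid: int, total: int,
                     base_quota: int = 100, min_percent: int = 50) -> int:
    """History-based rule: scale the quota by the lifetime valid percentage.

    Assumption: hosts below min_percent are held at 1 task/day no matter
    how their most recent results turned out.
    """
    if total == 0:
        return base_quota  # no history yet, so no penalty
    percent = 100 * valid // total
    if percent < min_percent:
        return 1
    return max(1, base_quota * valid // total)


# Misconfigured host: 50 valid out of 1000, currently on a lucky streak of 7.
print(quota_by_streak(7), quota_by_history(50, 1000))   # 100 1

# Reliable host: 990 valid out of 1000, streak just reset by one invalid.
print(quota_by_streak(0), quota_by_history(990, 1000))  # 1 99
```

Under these assumptions the streak rule restores the misconfigured host to full quota after a handful of lucky results while throttling the reliable host, and the history rule does the opposite.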
Frizz Joined: 17 May 99 Posts: 271 Credit: 5,852,934 RAC: 0 |
"Not too server intensive. To calculate such a proportion, only the number of failed tasks and the total number of tasks are needed. The number of tasks today is already kept. Nothing prevents keeping the number of invalid tasks today too." +1 Petition against 1366x768 glare displays: http://www.facebook.com/home.php?sk=group_153240404724993 |
Dr Grey Joined: 27 May 99 Posts: 154 Credit: 104,147,344 RAC: 21 |
I've got one here. Because my setup is new I keep checking for invalids. Today I found this one: http://setiathome.berkeley.edu/workunit.php?wuid=677758481 The trouble is, if you look at the record of the two other 'valid' computers, they don't look so good. Who would you trust? My Mum used to tell me two wrongs don't make a right. Not saying my machine is right, but it doesn't look as wrong as the others. |
Miep Joined: 23 Jul 99 Posts: 2412 Credit: 351,996 RAC: 0 |
"I've got one here. Because my setup is new I keep checking for invalids. Today I found this one:" Made link clickable. Yours is the right result that sadly (and fatally for the science database) was thrown out because it was paired with two black sheep running a faulty app (or rather an app not usable on Fermi). Bottom line - we are still getting false results inserted into the database because of Fermis set up the wrong way. Carola ------- I'm multilingual - I can misunderstand people in several languages! |
James Sotherden Joined: 16 May 99 Posts: 10436 Credit: 110,373,059 RAC: 54 |
I have an inconclusive. I have 30 spikes, he has 31 pulses. It's from my CPU; his is a Fermi with V12 apps. Both -9 overflows. Wonder who will validate it? It's our buddy 5472266. Here is the link: http://setiathome.berkeley.edu/workunit.php?wuid=679452515 Old James |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14682 Credit: 200,643,578 RAC: 874 |
"I have an inconclusive. I have 30 spikes, he has 31 pulses. It's from my CPU; his is a Fermi with V12 apps. Both -9 overflows. Wonder who will validate it? It's our buddy 5472266. Here is the link: http://setiathome.berkeley.edu/workunit.php?wuid=679452515" Stock Linux CPU app with a 1.9 day turnround - plenty of time to place your bets while we wait. I'll go for a second inconclusive and another resend ;-) |