Message boards :
Number crunching :
why do I get 0 credit when I claimed 40?
Author | Message |
---|---|
Tom Gutman Send message Joined: 20 Jul 00 Posts: 48 Credit: 219,500 RAC: 0 |
The claimed credit plays no part in the validation. What is compared is the actual results -- the part that actually matters. Different implementations of the same architecture should not cause any differences. While floating point results are only approximations to the real values, the result of a floating point operation is fully specified by the architecture. So a P II or a P III or a P4 or an Athlon should all give identical results, if running the same code. There are a couple of possible variations here. Different compilers, or different compile options, may produce different results. That is because the order of operations may vary (optimization is often based on mathematical identities, like associativity, that are not strictly obeyed by computer math) and the precision of intermediate results may vary (a compiler can choose instructions and settings for 80 bit intermediate results, or for strict 64 bit IEEE arithmetic). It is also possible for a single executable to contain different code paths and choose which to run depending on the available instruction set. ------- Tom Gutman |
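Tom's point about associativity can be seen directly. A minimal sketch (not SETI@home code): reordering a floating point sum, as an optimizing compiler may do, changes the result, because IEEE 754 addition is not associative.

```python
# IEEE 754 addition is not associative: the grouping a compiler picks
# can change the result of the "same" computation.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # the big terms cancel first, then 1.0 is added
right = a + (b + c)  # 1.0 is absorbed into -1e16 before the cancellation

print(left)   # 1.0
print(right)  # 0.0
```

The absolute difference is tiny relative to the operands, which is exactly why a validator needs a tolerance rather than bit-for-bit equality.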
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> Different implementations of the same architecture should not cause any differences. While floating point results are only approximations to the real values, the result of a floating point operation is fully specified by the architecture. So a P II or a P III or a P4 or an Athlon should all give identical results, if running the same code.

Should is right ... Though IEEE 754 suggests that you should be getting the same results, this is almost never realized in the real world. Because of the wide variation in the way the FPU can be programmed to operate, inherent differences in internal tables, etc., we may not wind up in the same county, much less the same city ... Even different "steppings" of the processor, though of the same class and speed, may not give the same values. The good news is that the comparisons are done against the final results of the detection of pulses, Gaussians, etc., so we still stand a chance. One of my rudest awakenings was when I found out that two processors, both IEEE 754 compliant, could and did return different outputs. Heck, even the assumption of the starting values for the rounding scheme selected can cause severe problems. |
Tom Gutman Send message Joined: 20 Jul 00 Posts: 48 Credit: 219,500 RAC: 0 |
> One of my rudest awakenings was when I found out that two processors, both IEEE 754 compliant, could and did return different outputs.

Were those two processors running the same code? Not just the same source, but the same compiled code? Or were they running different code, either as two separate compiles or a compiler that generates alternative code for different processors? ------- Tom Gutman |
Siran d'Vel'nahr Send message Joined: 23 May 99 Posts: 7379 Credit: 44,181,323 RAC: 238 |
> > --- For whatever reason, SETI@Home decided to send a WU out to 3 different PCs. Once the results are returned they are analyzed. If all 3 results are close in comparison, all 3 get credit. If, however, 1 does not come close to the other 2, credit is not granted for that one. Reasons for the 0.00 granted credit could be overclocking, bugs in the client software, bugs in the server software doing the analysis, etc., etc.
>
> ... or differences in the CPUs themselves. Two identical (manufacturer/type/speed) processors should give absolutely identical answers.
>
> I'd expect a small difference between the result from a Pentium II and a Pentium 4 (or a K6 and an Athlon XP).
>
> I'd expect a slightly larger (but still small) difference between an Athlon XP and a Pentium 4.
>
> ... and a little bit more when we leave the Intel architectures (Sun, Apple, etc.)
>
> In theory, the validator has a little bit of a "fuzz factor" in the comparison to allow reasonable differences between a K6 and a Sun workstation. This might need a little tuning, or it may be working just fine and the results really are a little too far apart.

Very good points, and you would think that the validator would know how to take this into account. But that's probably what you meant by "fuzz factor", right? L8R.... --- CAPT Siran d'Vel'nahr - L L & P _\\// Winders 11 OS? "What a piece of junk!" - L. Skywalker "Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath |
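The "fuzz factor" being discussed can be sketched in a few lines. This is a hypothetical illustration, not the actual validator -- the real one compares detected signals (pulses, Gaussians, etc.), and its tolerance is not public in this thread -- but the tolerance-based comparison works the same way in spirit.

```python
import math

# Hypothetical "fuzz factor" check: two results are considered matching
# if they agree within a relative tolerance, rather than bit-for-bit.
def results_agree(a, b, rel_tol=1e-5):
    """Return True if two result values match within the tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)

print(results_agree(1.000001, 1.000002))  # True: within the fuzz factor
print(results_agree(1.0, 1.1))            # False: too far apart
```

Tuning `rel_tol` is exactly the trade-off mentioned above: too tight and a K6 and a Sun workstation never agree; too loose and genuinely wrong results slip through.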
Mark Gray Send message Joined: 3 Apr 99 Posts: 22 Credit: 236 RAC: 0 |
It's workunits like this one that get me worried (one of those hosts is mine even though it's under a different account name). There are two processors based on P4 technology in there. One is valid, the other isn't. My first thought was that it must be something wrong with my hardware, but given the number of people having similar problems, probably not. Of course, if it is something wrong with my hardware, then I'd like to know so I can do something about it. <a href="http://www.teamocuk.com">BOINC stats site - teamocuk.com</a> |
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> > One of my rudest awakenings was when I found out that two processors, both IEEE 754 compliant, could and did return different outputs.
>
> Were those two processors running the same code? Not just the same source, but the same compiled code? Or were they running different code, either as two separate compiles or a compiler that generates alternative code for different processors?

Same code, equivalent hardware; same compiler and same source; different hardware, different compiler. Though the results with the same code and compiler were close ... |
stk Send message Joined: 2 Aug 00 Posts: 1 Credit: 257,858 RAC: 0 |
> I have noticed several times when I send a wu in I am given 0 credit when my pc claimed say 40. The other two people got credit, so why do I not get any? Thanks

Rachel - You're not alone with your question. I was VIEWing my Results page today and noticed several WUs where I was granted 0 credit (which, looking back, hasn't happened very often with my computer "91726"). What up?

13586732  165848  25 Oct 2004  1 Nov 2004   Over  Success  Done  18,833.63  72.45  43.05
13586733  316357  25 Oct 2004  29 Oct 2004  Over  Success  Done  21,170.86  43.05  43.05
13586734  91726   25 Oct 2004  31 Oct 2004  Over  Success  Done  21,125.81  36.73  0.00

I click the result ID for my '0' credit WU & see only that the "Validate State" is INVALID. This is certainly meaningful? (I assume that during validation - HOWEVER THAT IS DONE - my returned WU 'failed' - HOWEVER THAT IS MEASURED.) No one likes to 'fail' & I am somewhat surprised, upon investigation, to learn that results are not IDENTICAL. In my own little-geophysicist world, I cotton to discrete, quantifiable answers and just ASSuMEd that BOINC was looking for 3 IDENTICAL results. I do note that my result was not returned last; rather, it was the 2nd result returned. So, I conclude that the first two returned results are NOT used to make a quorum. I suspect that the validation looks at a standard deviation, with bias toward the two WU results that are closest in "value". It is unfortunate that ZERO credit is awarded, despite the CPU time utilized. The BOINC people could have handled this so much more POSITIVELY. Why use a negative word like "failed"? Wouldn't it be better to say the result was 'statistically different', or 'out of bounds'? Additionally, if we're supposed to be in this for the science & NOT the credit, why assign credit? Because it pumps the competitive juices and benefits the project! We all know of crunchers who BUY computers to build SETI farms. Doubt they'd do it without the motivating CREDIT factor!
But still, if credits are to be taken lightly, why shouldn't SETI allocate credit even for a WU result that IS 'out of bounds'? What's the harm? In the end, wouldn't it be a more positive way of acknowledging time spent by the client? Just my thoughts. |
JAF Send message Joined: 9 Aug 00 Posts: 289 Credit: 168,721 RAC: 0 |
> Were those two processors running the same code? Not just the same source, but the same compiled code? Or were they running different code, either as two separate compiles or a compiler that generates alternative code for different processors?

And here's the problem. In a case where two results get credit and one is invalid, which result was correct? The two results that were credited and maybe crunched on the same hardware and software compiled with the same compiler/options, or the "invalid" unit? The invalid unit could very well be "more right". I remember doing a computer rehost of a major flight simulator. One of our biggest headaches in the software translation was NaN (not-a-number) exceptions. Of course we were also changing FP format and translating some old legacy Fortran and C code. |
UBT - PaulT Send message Joined: 17 Dec 00 Posts: 25 Credit: 173,834 RAC: 0 |
I've just noticed this on the result I just sent in: http://setiweb.ssl.berkeley.edu/workunit.php?wuid=2917435 Both machines that gained credit were running Linux and claimed 27.94 credits. I run Win XP Home and claim 69.74 but get sod all. Perhaps, as has been mentioned in other threads, version 4.05 is doing "Bad Science" and taking twice as long to do it. I doubt it's down to the processor, as an Athlon XP and an Intel Celeron both get credit yet my Athlon64 gets none, and I would have thought that both Athlons would return the same results. |
[ue] Hayko Send message Joined: 8 Dec 02 Posts: 1 Credit: 923,129 RAC: 0 |
Can seti@home give me my credits... http://setiweb.ssl.berkeley.edu/results.php?hostid=325554 http://setiweb.ssl.berkeley.edu/results.php?hostid=171739&offset=40 http://setiweb.ssl.berkeley.edu/results.php?hostid=67495 http://setiweb.ssl.berkeley.edu/results.php?hostid=164610&offset=20 http://setiweb.ssl.berkeley.edu/results.php?hostid=67495 Yeah, at least 5000 credits are gone... if this keeps going I'll stop running seti/boinc |
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> And here's the problem. In a case where two results get credit and one is invalid, which result was correct? Two results that were credited and maybe crunched on the same hardware and software compiled with the same compiler/options, or the "invalid" unit? The invalid unit could very well be "more right".

You are correct. It is possible that the "invalid" result is actually more "correct" than the two that match. However, in the case of SETI@Home there is no significant downside to picking the two that match. Life is not involved, nor, ahem, serious science. So, with no downside, the best choice is to pick the two that "agree" and declare them the winner. Probability says that the pair that agree are more (most) likely to be "correct" than the singleton ... |
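The quorum logic Paul describes can be sketched simply. This is an illustrative toy, not the real validator (which compares whole signal reports, not a single number): among three returned results, find a pair that agree within tolerance and treat the odd one out as invalid.

```python
from itertools import combinations

# Toy quorum check: return the indices of the first pair of results
# that agree within the tolerance, or None if no pair agrees.
def find_quorum(results, tol=0.01):
    for i, j in combinations(range(len(results)), 2):
        if abs(results[i] - results[j]) <= tol:
            return (i, j)
    return None

# Values loosely echoing the claimed credits in stk's table above.
returns = [43.05, 43.05, 36.73]
print(find_quorum(returns))  # (0, 1): the matching pair "wins"
```

Note the failure mode discussed in the thread: if the two matching results share the same systematic error, the lone dissenter loses even when it is "more right".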
JAF Send message Joined: 9 Aug 00 Posts: 289 Credit: 168,721 RAC: 0 |
> You are correct. It is possible that the "invalid" result is actually more "correct" than the two that match. However, in the case of SETI@Home there is no significant downside to picking the two that match. Life is not involved, nor, ahem, serious science. So, with no downside, the best choice is to pick the two that "agree" and declare them the winner. Probability says that the pair that agree are more (most) likely to be "correct" than the singleton ...

Paul, I understand and agree with you. But I also think that some type of self test using "Seti like data", run through the normal Boinc Seti processing, would at least let a person know if their CPU/memory/system was working well enough to successfully run Boinc Seti. If the floating point calculations are so "tight" that many systems are getting "invalid results", then (to me) the science would be suspect. I suspect the problem is not with most users' systems. I would bet most are running the "stock" Seti Boinc and Seti client software. The problem is there is no way to validate your system if you get invalid results. Pumping Seti like data through Boinc Seti and getting the correct (expected) results would give one some confidence in their system. Then it would be a matter of running a different distributed project or putting up with the "invalids" until the problem is fixed. I have three computers; the two I built myself have the least amount of "invalid results" but use the cheapest components (and are faster than the more expensive system). I keep them up-to-date, regularly run diagnostics to check them out, watch the case and CPU temperatures, and they are not over-clocked. The problem with the diagnostics is they don't exactly test (crunch) data like Seti does, so there's always doubt that the diagnostic results are valid. So, give me a "self test" option that feeds the Seti program realistic WU data and compares the results.
If my system can't calculate the pre-defined data and come up with the correct results, then I should expect "invalid results" when I crunch regular Seti Work Units. |
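JAF's self-test idea reduces to: run a deterministic, floating-point-heavy reference workload and compare against a known-good answer. The sketch below is entirely hypothetical (no such SETI tool existed); the workload and function names are made up, and the expected value is computed in-process here only for the demo -- a real test would ship a reference value produced on a trusted machine.

```python
import math

# Hypothetical self-test: a deterministic FP-heavy loop stands in for a
# "known" work unit with pre-computed answers.
def reference_workload(n=10000):
    acc = 0.0
    for k in range(1, n + 1):
        acc += math.sin(k) / k
    return acc

# In a real tool this constant would ship with the installer, computed
# on trusted hardware; here we compute it in-process for the demo.
EXPECTED = reference_workload()

def self_test():
    got = reference_workload()
    ok = math.isclose(got, EXPECTED, rel_tol=1e-12)
    return "PASS" if ok else "FAIL: expect invalid results"

print(self_test())
```

A machine with flaky memory or an overheating CPU would (intermittently) fail such a test, which is the diagnostic signal JAF is after.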
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> Paul, I understand and agree with you. But I also think that some type of self test using "Seti like data", run through the normal Boinc Seti processing, would at least let a person know if their CPU/memory/system was working well enough to successfully run Boinc Seti.

Yes, a pre-defined WU with "known" answers would do this. But it should be a regular WU so the test is a valid exercise. Most people would not run it because of the time it might take, but this is a good thought. Suggest it to Dr. Anderson...

> If the floating point calculations are so "tight" that many systems are getting "invalid results", then (to me) the science would be suspect.

Well, I am not sure I would go that far. The fact of the matter is that we have systems that deliver calculations out to 76 places, with all but the first few being needed. For old folks like me who remember slide rules, we have this notion that anything past the first 3 places is so much noise ...

> So, give me a "self test" option that feeds the Seti program realistic WU data and compares the results. If my system can't calculate the pre-defined data and come up with the correct results, then I should expect "invalid results" when I crunch regular Seti Work Units.

As I said, a good suggestion. Hopefully it will be taken up and added to the things to do ... |
JAF Send message Joined: 9 Aug 00 Posts: 289 Credit: 168,721 RAC: 0 |
> Yes, a pre-defined WU with "known" answers would do this. But it should be a regular WU so the test is a valid exercise. Most people would not run it because of the time it might take, but this is a good thought. Suggest it to Dr. Anderson...

I was hoping to circumvent the upload/download and Boinc server processes so the test would simply check one's processor, memory, operating system and hard drive in a real world "crunching" environment. The "known data" should be actual data from a WU (a high angle unit might be desirable). If one's system passed for a reasonable amount of time, well, that would be about all one could do. The invalid results would seem to be out of the user's control.

> > If the floating point calculations are so "tight" that many systems are getting "invalid results", then (to me) the science would be suspect.
>
> Well, I am not sure I would go that far. The fact of the matter is that we have systems that deliver calculations out to 76 places, with all but the first few being needed. For old folks like me who remember slide rules, we have this notion that anything past the first 3 places is so much noise ...

I've done quite a bit of crunching with Boinc Seti (over 56,000 credits) on three systems and have gotten several "invalids" in the middle of the night when I'm not using the computer and when the ambient temperature is rather low, so I'm suspicious of the "invalids". You are probably correct, though, that most users wouldn't take the time to run the test. By the way, I still have my four slide rules: two full size (one bamboo and one aluminum?), a six inch model, and a circular. Anyway, I was just attempting to figure out a way to give a person a way to check their system for problems on their end that are under their control. |
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> I was hoping to circumvent the upload/download and Boinc server processes so the test would simply check one's processor, memory, operating system and hard drive in a real world "crunching" environment. The "known data" should be actual data from a WU (a high angle unit might be desirable).

Because it is a "test" WU, it could be made part of the install, especially since the WU file is small. Once completed, a compare with a "Master Answer" could be done (also downloaded with the WU). So, there would be no network activity. As a stress test for HW this would be ideal.

> By the way, I still have my four slide rules: two full size (one bamboo and one aluminum?), a six inch model, and a circular.

I only have two I think, my bamboo, a metal one, and maybe a circular... |
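Paul's bundled-test variant -- ship a small test WU plus a "Master Answer" file with the installer and diff them locally, no network needed -- might look like this. The file names and number format are hypothetical; this is a sketch of the comparison step only.

```python
from pathlib import Path

# Hypothetical local check: compare a crunched result file against a
# bundled "master answer" file, within a tolerance, with no network I/O.
def check_against_master(result_path, master_path, tol=1e-6):
    got = [float(x) for x in Path(result_path).read_text().split()]
    want = [float(x) for x in Path(master_path).read_text().split()]
    if len(got) != len(want):
        return False
    return all(abs(g - w) <= tol for g, w in zip(got, want))

# Demo: write the two files, then compare.
Path("result.txt").write_text("1.5 2.25 3.0")
Path("master_answer.txt").write_text("1.5 2.25 3.0000001")
print(check_against_master("result.txt", "master_answer.txt"))  # True
```

Note the tolerance reappears even here: as the rest of the thread establishes, an exact byte-for-byte match of floating point output across machines is not a realistic expectation.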
AthlonRob Send message Joined: 18 May 99 Posts: 378 Credit: 7,041 RAC: 0 |
> I see two types of failure to give proper credit.
>
> 1) My machine reports a much lower # of CPU seconds (even though the credit claimed is similar)
>
> Being penalized for having a really fast machine seems terribly unfair.

My apologies if I'm misunderstanding what you're typing... If you're claiming a similar amount of credit, but reporting a lower number of CPU seconds, you aren't being penalized at all. A faster CPU will require fewer CPU seconds (less processing time) than a slower CPU to complete the same task. Boinc, via the benchmarks and complicated credit system, makes up for this by granting you the same credits, even though it took you less time than the slower CPUs. You're getting the same amount of credit, so you aren't being penalized. Rob |
zmaniac Send message Joined: 3 Apr 99 Posts: 3 Credit: 89,577 RAC: 0 |
> > I see two types of failure to give proper credit.
> >
> > 1) My machine reports a much lower # of CPU seconds (even though the credit claimed is similar)
> >
> > Being penalized for having a really fast machine seems terribly unfair.
>
> My apologies if I'm misunderstanding what you're typing... If you're claiming a similar amount of credit, but reporting a lower number of CPU seconds, you aren't being penalized at all. A faster CPU will require fewer CPU seconds (less processing time) than a slower CPU to complete the same task.

If you will examine the WUs linked in my post you will see that what is occurring on those occasions is that I'm being given *zero* credit while the other two computers get credit. The two cases I posted about with supporting links are examples of repeating cases of zero credit and are pretty clear evidence of bugs in the credit decision code (although my analysis of the reason for zero credit could easily be wrong). I'm a Mac programmer working with customized BOINC SETI for G5, but if these problems with credit are not going to be fixed I'm just gonna pass and leave it to others. Jim |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.