Do not get credits etc
Author | Message |
---|---|
JAF Send message Joined: 9 Aug 00 Posts: 289 Credit: 168,721 RAC: 0 |
I believe Ned is on the right track with his explanation. Since the project needs (and should support) the many processors and operating systems we have today, loosening up the validation might not be the best solution. However, I wonder if the project could write and distribute a "seti stress test program" that one could run to check their computer. The program could cycle on "Seti like calculations" using canned data so the expected results would be consistent. A simple pass/fail message would suffice. If a person starts getting "invalid results" they could run this program to see if their system is having problems. Of course this wouldn't be 100 percent conclusive, since there could still be some intermittent problems that would be hard to catch. |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
I think the project itself is the stress test. The cool bit would be adding an "index of accuracy" of some sort, maybe as simple as saying "on average, 0.1% of all results get 0 credit, for your machine, 20% of all results get 0 credit." > However, I wonder if the project could write and distribute a "seti stress > test program" that one could run to check their computer. The program could > cycle on "Seti like calculations" using canned data so the expected results > would be consistent. A simple pass/fail message would suffice. |
Captain Avatar Send message Joined: 17 May 99 Posts: 15133 Credit: 529,088 RAC: 0 |
> However, I wonder if the project could write and distribute a "seti stress > test program" that one could run to check their computer. The program could > cycle on "Seti like calculations" using canned data so the expected results > would be consistent. A simple pass/fail message would suffice. I believe there was a test unit back in beta. |
The worm that turned Send message Joined: 15 May 99 Posts: 100 Credit: 4,872,533 RAC: 0 |
AAAAGGGGHH! This thread is driving me crazy. It's a Berkeley hardware or software problem, I tell you. visit boinc@Australia |
JAF Send message Joined: 9 Aug 00 Posts: 289 Credit: 168,721 RAC: 0 |
> I think the project itself is the stress test. > Yes, the project is the stress test. But there are so many variables with different work units, processors, operating systems, and Internet connections, that it's hard for one to narrow down a potential problem with their system or the project. What I'm thinking of, if one starts getting invalid results, one could run a floating point test that performs "seti like calculations" with a set data package (internal to the program and the same for all platforms). The program would crunch the data and compare the results. Run it long enough to get the system temperatures up and see if errors occur or not. Then if there are errors, there's an indication that you can't process Seti data on your system without error. No project "detaches or resets" just to see if the WU's are the problem. If you get errors, post a message about your system and you will probably get a response from someone with the same setup that doesn't get errors running the test program. I know there are a lot of system diagnostic/performance programs out there, like SIS Sandra, Passmark, Performance Test, etc. But they are not consistent for each processor and operating system. I'm thinking of a much simpler test that duplicates Seti calculations so each and every system should give a pass/fail indication. Rule out the uncontrollable data like WU's and Internet connections. |
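A minimal sketch of the kind of self-test JAF describes, offered only as an illustration. Everything in it is an assumption: the function names `canned_signal`, `power_spectrum`, and `self_test` are hypothetical, the naive power spectrum merely stands in for "Seti like calculations", and the check compares repeated passes on the same machine rather than against a reference shipped with the program.

```python
import math

def canned_signal(n=64):
    # Hypothetical canned data: two fixed tones, identical on every run.
    return [math.sin(2 * math.pi * 5 * i / n) + 0.5 * math.sin(2 * math.pi * 12 * i / n)
            for i in range(n)]

def power_spectrum(signal):
    # Naive discrete Fourier transform power spectrum (floating-point heavy on purpose).
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def self_test(passes=3, n=64, rel_tol=1e-9):
    # Compute once, then repeat and compare; a stable machine should
    # reproduce its own result on every pass.
    signal = canned_signal(n)
    reference = power_spectrum(signal)
    for _ in range(passes):
        result = power_spectrum(signal)
        for a, b in zip(reference, result):
            if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-12):
                return "FAIL"
    return "PASS"

print(self_test())
```

Raising `passes` keeps the processor busy long enough to warm the system up, which is the case JAF is interested in. As he notes, a pass is not conclusive, since intermittent faults can still slip through.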
Sir Ulli Send message Joined: 21 Oct 99 Posts: 2246 Credit: 6,136,250 RAC: 0 |
I have no Probs with 3 Hosts here... Greetings from Germany NRW Ulli S@h Berkeley's Staff Friends Club m7 © |
texasfit Send message Joined: 11 May 03 Posts: 223 Credit: 500,626 RAC: 0 |
> I think the project itself is the stress test. > > The cool bit would be adding an "index of accuracy" of some sort, maybe as > simple as saying "on average, 0.1% of all results get 0 credit, for your > machine, 20% of all results get 0 credit." > > Ned, I agree with your analysis about certain unstable overclocked systems. In the failed wu's that I have researched, however, there has always been a connection with a hardware or software related problem at or close to that time period. Many of the people who have been getting these 0 credit errors do not have their systems overclocked. I have personally had only 7 wu's show this 0 credit with an Invalid result, and that is across all three of my systems that are currently running Seti Boinc. All 7 of these I traced to a Berkeley hardware or software issue. I am sure that I could very easily have had many, many more, as did some others, but I usually caught the news and disabled network access or did whatever I felt necessary at that time. Rachel is the perfect example of this scenario. If I recall, she is running a Dell 2.8 GHz system that her hubby bought for her. The wu's she was having problems with were from the time of the last upload/download directory hard lock. Berkeley had to reboot that server and sync the directory. Rachel did not have any of these problems before this or after she reset the project. I think that most of us have the science in mind as the main objective, but if some wu's are downloaded that cannot be returned with valid results, then they are of no value to the project or the seti participant. I think the Boinc Team is doing an outstanding job with correcting the bugs and dealing with the hardware they have on hand at this time. Keep up the good work, Boinc Team, and we'll just keep on crunching those units for you. |
texasfit Send message Joined: 11 May 03 Posts: 223 Credit: 500,626 RAC: 0 |
> texasfit, > > Thanks for your summary of the old tech news briefs from that period. I > suspect that the wu's in question fall somewhere within the server problems > back then. The missing/lost credits are not a big deal. Out of curiosity, > where did you pull those old tech notices from? > > vvinyyc The link for the news is on the Main Home Page below current news. Just hit the link at the bottom of current news called '....more' |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
I've been reading Rachel's posts for a while and I can't really spot a correlation between system-wide problems and her complaints over the past weeks. Not picking on Rachel, just looking at the data. A problem like this is either going to affect everyone equally, or it's going to be specific to certain machines (or users). So, let's take something like a 30 day window. If the project receives a million work units back and 5000 of those get zero credit, then project-wide we're losing 0.5%. If we do the same calculation for each host separately, that should come out somewhere around 0.5% as well. If we instead calculate a 20% loss, then it's time to take a good hard look at that computer. That doesn't require test work units (that may not show the error), or a test long enough to spot one error in a trillion. > I agree with your analysis about certain unstable overclocked systems. > > From the failed wu's that I have researched however, there has always been a > connection with either a hardware / software related problem at or close to > that time period. Many of the people that have been getting these 0 credit > errors do not have their systems overclocked. |
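The comparison Ned describes is simple enough to sketch. This is only an illustration with made-up host data; the names `zero_credit_rate` and `flag_hosts`, the `factor` of 10, and the `min_results` cutoff are all assumptions, not anything the project is known to use.

```python
def zero_credit_rate(zero_credit, returned):
    # Fraction of returned results that earned no credit.
    if returned == 0:
        return 0.0
    return zero_credit / returned

def flag_hosts(project_rate, hosts, factor=10.0, min_results=20):
    # hosts maps a host id to (zero_credit_count, returned_count).
    # A host is flagged when its rate is far above the project-wide rate.
    flagged = []
    for host_id, (zero_credit, returned) in hosts.items():
        if returned < min_results:
            continue  # too few results in the window to judge
        if zero_credit_rate(zero_credit, returned) > factor * project_rate:
            flagged.append(host_id)
    return flagged

# Project-wide figures from the post: 5000 zero-credit results out of a million.
project_rate = zero_credit_rate(5000, 1_000_000)

# Hypothetical hosts over the same 30 day window.
hosts = {"host_a": (1, 200), "host_b": (40, 200), "host_c": (2, 5)}
print(project_rate, flag_hosts(project_rate, hosts))
```

Here `host_a` sits at the project-wide 0.5%, `host_b` is at 20% and gets flagged, and `host_c` is skipped because five results are too few to say anything.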
texasfit Send message Joined: 11 May 03 Posts: 223 Credit: 500,626 RAC: 0 |
I am still not convinced it is a computer specific issue. Even though I agree with your statement: > >If we do the same calculation for each host separately, that should come out >somewhere around 0.5% as well. If we instead calculate a 20% loss, then it's >time to take a good hard look at that computer. > Every computer downloads a different quantity of wu's on different days, which would make it necessary to work your host calculation on a time specific variable. My P4-HT computers will download about 20 new wu's every day or two vs your systems' only 3 every day or two. This would mean that if I downloaded or uploaded during a problem time period of a day I would have nearly seven times as many units to handle. I checked some of your wu's that showed errors and all the units I checked with errors on your computers were related to hardware/software problems on those days. An example of yours would be invalids returned on 10/12-14 which was during the time of the: October 9, 2004 A problem with file uploads has existed for the last few days. October 12, 2004 - 21:00 UTC We just had an unexpected crash of the new replica database while we were trying to format it. Bad disks? Bad cables? Who knows? But we're going to abandon the replica database project for now. October 12, 2004 We just had (about 21:00 UTC) an unexpected outage due to a disk crash. Everything is back up now. Your turn :-) |
Rachel Send message Joined: 13 Apr 02 Posts: 978 Credit: 449,704 RAC: 0 |
I think it was a bad batch of wu's. Everything is working great now, thanks. Rach ......In Space No One Can Hear You Scream...... |
Tony Martin Send message Joined: 5 Dec 99 Posts: 91 Credit: 69,723 RAC: 0 |
I don't agree. Look at this WU: http://setiweb.ssl.berkeley.edu/workunit.php?wuid=2998766 Now tell me why my machine got 0 credit and was invalid and the other 2 were valid and got credit? My computer worked 16 hours to complete it and report it back to seti and all I got was 0 credit. The WU completed with no errors that I can see, yet it was marked invalid. Why?????? |
Rachel Send message Joined: 13 Apr 02 Posts: 978 Credit: 449,704 RAC: 0 |
> I don't agree look at this WU > http://setiweb.ssl.berkeley.edu/workunit.php?wuid=2998766 > now tell me why my machine got 0 credit and was invalid and the other 2 were > valid and got credit? My computer worked 16 hours to complete it and report it > back to seti and all I got was 0 credit. The Wu completed with no errors that > I can see yet it was marked invalid. Why?????? > > Because it did not match the other 2 results? I had many like that too. ......In Space No One Can Hear You Scream...... |
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> I don't agree look at this WU > http://setiweb.ssl.berkeley.edu/workunit.php?wuid=2998766 > now tell me why my machine got 0 credit and was invalid and the other 2 were > valid and got credit? My computer worked 16 hours to complete it and report it > back to seti and all I got was 0 credit. The Wu completed with no errors that > I can see yet it was marked invalid. Why?????? Tony, Your Result was marked as "Invalid", which means it failed to compare with the Canonical Result (the "ideal" or "example" result for this WU), and therefore gathered no credit. Since there is no specific error for this WU in the error text, all we can do is guess at the exact reason for the failure. Worm, > AAAAGGGGHH > > This thread is driving me crazy > It's a Berkeley Hardware or software problem I tell you. And it may very well be a problem on the back end. On the other hand, it may not be an error anywhere. One of the nicer parts of SETI@Home on BOINC is that we have visibility like never before. In the old days you had no idea if what you were doing was of any value at all. Yes, you did get a "bean" for each WU processed, but no idea of what your contribution really was to the science. We have that now, but the freedom and information does come with its own price... |
The worm that turned Send message Joined: 15 May 99 Posts: 100 Credit: 4,872,533 RAC: 0 |
Paul, All I'm saying is don't get everyone worried that their computer systems are faulty, when all the evidence is pointing to a Berkeley hardware failure of some kind. Some of the earlier posts in this thread were plainly going along that track. visit boinc@Australia |
Mr Slick Send message Joined: 18 May 01 Posts: 7 Credit: 113,202 RAC: 0 |
I seem to lose THOUSANDS of credits. Look here: http://setiweb.ssl.berkeley.edu/results.php?hostid=205192&offset=60 I get this message on one computer.. SETI@home - 2004-11-01 13:59:18 - Scheduler RPC to http://setiboincdata.ssl.berkeley.edu/sah_cgi/cgi failed But on another computer I see... SETI@home - 2004-11-01 13:59:26 - Scheduler RPC to http://setiboincdata.ssl.berkeley.edu/sah_cgi/cgi succeeded I have many data units completed between 3 computers, but for a week now, no connections. All 3 computers have version 4.13; one works fine, two do not connect right now. I have tried to swap the program folder to the computer that I know connects, and the same problem exists; now the one that did connect won't upload, so I know it is not the individual computer. I had the same problem with version 4.09. When the computers finally upload the files (sometimes after several weeks), I get no credit for the 'old' files. I have no firewall issues or connectivity problems. And usually this only seems to happen to one computer at a time. I know there is nothing I have done wrong here, but it is frustrating when I lose thousands of credits. |
Paul D. Buck Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
> Paul, > All i'm saying is don't get everyone worried that their computer systems > are faulty, when all the evidence is pointing to a Berkeley hardware failure > of some kind. > Some of the earlier posts in this thread were plainly going along that track. I did not think that I was doing that. But I concede the point that I may have given that impression. All I am saying is that there are many reasons for the 0 credit WU/Result, so pointing to only one possible cause is not a complete listing of the possibilities. Possible causes include (but are not limited to): 1) Errors in the back-end software (validator, assimilator, etc.) 2) Errors in the BOINC Software 3) Errors in the SETI@Home science application 4) Errors caused by differences in compilers used for the various OS 5) Errors caused by client side hardware problems (flaky memory, overclocking, etc.) 6) Errors caused by differences in the Floating Point hardware (FPU) 7) Errors caused by faulty uploads and downloads 8) Errors caused by back end hardware failures 9) Errors caused by magic and karma To seize upon only one of these possibilities as a source of all problems is not, um, good. Many of the Results that I have looked at are marked as "Invalid" and that could be caused by any one of the possibilities I listed above. All we know is that it is marked as invalid but we do not know why. The simple explanation is that this result (when so marked) did not compare with the others and, in the voting scheme used, it loses. Note that this does not mean that the work unit is bad, or even that the singleton result is not the correct one. This is why a project may do a "World Series" type of compare by running the same WU more than thrice and demanding a much higher correlation between Results before they will consider that they have the proper answer. As you have noted, one of the reasons for errors is the hardware failures we have seen on the back end systems. But the highest probability of the occurrence of problems lies in the domain of the Participant's computer because it is not as controlled as the other aspects. Many participants have low cost machines, others wanting higher performance may over-clock them, and so forth ... |
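The voting scheme Paul mentions can be pictured roughly as follows. This is a sketch under stated assumptions, not the actual BOINC validator: the names `results_agree` and `validate`, the tolerance, and the quorum of 2 are all hypothetical, and each result is reduced to a short list of numbers for the sake of the example.

```python
import math

def results_agree(a, b, rel_tol=1e-5):
    # Two results agree if every value matches within a relative tolerance.
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))

def validate(results, quorum=2):
    # results maps a host name to that host's list of computed values.
    # The first result that enough hosts agree with becomes the canonical one;
    # every result that does not match it is marked invalid.
    hosts = list(results)
    for candidate in hosts:
        matching = [h for h in hosts if results_agree(results[candidate], results[h])]
        if len(matching) >= quorum:
            verdicts = {h: ("valid" if h in matching else "invalid") for h in hosts}
            return candidate, verdicts
    return None, {h: "inconclusive" for h in hosts}

# Three hypothetical results for one work unit; host "c" disagrees with the other two.
canonical, verdicts = validate({"a": [1.0, 2.0], "b": [1.0000001, 2.0], "c": [1.0, 2.5]})
print(canonical, verdicts)
```

Note that the sketch only says host "c" lost the vote. It says nothing about why, which is the point of the list of possible causes above.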
The worm that turned Send message Joined: 15 May 99 Posts: 100 Credit: 4,872,533 RAC: 0 |
Paul, I wasn't referring to you in regard to getting people worried about their computer systems failing. I said some of the earlier posts. I do find it hard to understand, though, why you can't accept that in most cases the current problem with zero credits is a Berkeley hardware or software error. This problem started to appear quickly on crunchers using different systems all around the world. I know we will always find the odd instance of zero credit appearing to any of us, but this was something else. Logic tells me it can't be individual computer errors and thus we should look to Berkeley first. The problem (whatever it was/is) seems to be feeding out of the system now and I expect people to see instances of zero credit falling to normal levels. PS By the way Paul, you're a legend down here in Oz and a credit to the forum. Don't ever stop. visit boinc@Australia |