Error
Author | Message |
---|---|
Bernie Vine Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328 |
Just noticed that one machine had a problem yesterday and errored out 97 tasks. Does anyone know what this error is? http://setiathome.berkeley.edu/result.php?resultid=2832633964 |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Were all these GPU WUs? Were they all consecutive, i.e. did they all die rapid-fire, one after the other? Just a guess: did your graphics card have a problem? Maybe the fan is failing and the card overheated? Try rebooting your machine and see if that cures it. Try getting SpeedFan (freeware) and checking the temps. |
Bernie Vine Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328 |
Yes, they were all GPU, and yes, they all errored in seconds. The first thing I did was reboot. I am running GPU Tweak and it reports a steady 65°C at 30% fan. |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
OK, I'm 0 for 1 so far. Had you done anything to the environment, like installing new graphics drivers or a newer version of BOINC? Are you running the stock apps or Lunatics? (I'm thinking maybe some sort of CUDA library mismatch that has been discussed here recently. I don't remember the exact situations; search the forums here and try to find it.) |
Glen Joined: 5 Feb 13 Posts: 8 Credit: 26,875 RAC: 0 |
Just a small problem. I have my SETI preferences set not to receive Astropulse units, especially the CUDA ones, as I have already lost one graphics card, and I'm on a fixed income and cannot afford to replace hardware every couple of months. But this morning, while updating SETI, it started to download an Astropulse CUDA unit. Why did this happen? I hope I can trust SETI not to push my system to the point it breaks. If you have no normal units, I do not wish to receive Astropulse; I'll wait till you have more units and start doing another project. |
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
Glen wrote: "Just a small problem. I have my SETI preferences set not to receive Astropulse units, especially the CUDA ones, as I have already lost one graphics card, and I'm on a fixed income and cannot afford to replace hardware every couple of months. But this morning, while updating SETI, it started to download an Astropulse CUDA unit. Why did this happen? I hope I can trust SETI not to push my system to the point it breaks. If you have no normal units, I do not wish to receive Astropulse; I'll wait till you have more units and start doing another project." That will happen to a card if you stick it in a poorly ventilated case and/or let its cooling fan(s) run too slowly. I'm a pensioner, but that doesn't stop me from running my video cards at full bore, though I do have them in very well ventilated cases and their fan speeds are set not to drop below 60%. Cheers. |
Glen Joined: 5 Feb 13 Posts: 8 Credit: 26,875 RAC: 0 |
So why did SETI send me an Astropulse unit when the settings are set not to? Good on you for running your graphics card full bore. For the record, you can't have better ventilation than having the side covers removed, which I have, and I have been doing SETI longer than you have, bro. I originally joined in 1999, so I think I've got a bit more experience than you running SETI. You're a legend, mate. Keep up the good work, and keep a few bucks spare just in case that card of yours does burn out. |
Glen Joined: 5 Feb 13 Posts: 8 Credit: 26,875 RAC: 0 |
One last thing: you will find my stats under Glenn. I'm in 2019th place for the country, with a points score of 350,000+ before the graphics card shit itself and I stopped doing it; I rejoined only recently as a new user. Your stats tell me you have a computer set up to do only BOINC, doing 67,000 points a day, whereas I use my one computer for everything, not just SETI, and Astropulse hangs the system too much for it to be usable: opening a client takes 3 seconds. Normal CUDA units don't do this; I only have trouble if I'm watching a movie, so I turn the GPU off but leave the CPUs running with no problems. I'm not a newbie or somebody who doesn't know much about computers; I'm very knowledgeable and have formal training. |
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
Glen wrote: "So why did SETI send me an Astropulse unit when the settings are set not to? Good on you for running your graphics card full bore. For the record, you can't have better ventilation than having the side covers removed, which I have, and I have been doing SETI longer than you have, bro. I originally joined in 1999, so I think I've got a bit more experience than you running SETI. You're a legend, mate. Keep up the good work, and keep a few bucks spare just in case that card of yours does burn out." As to why you were sent APs I cannot say, but better than having the side off is to have a 200mm fan in the side blowing over 2 or 3 cards. My 2 main rigs are housed in CoolerMaster 942HAF cases, so ventilation is well above just having a side off. Plus I have done SETI since it first started, but the original account was lost due to an expired email addy; that's another story, and the old AMD 33MHz 386 couldn't really do much, so those few months were really no loss. Actually I'm considering another 2 new video cards myself so that I can retire my old E6300 and 3 old 9800GTs. Glen wrote: "One last thing: you will find my stats under Glenn. I'm in 2019th place for the country, with a points score of 350,000+ before the graphics card shit itself and I stopped doing it; I rejoined only recently as a new user. Your stats tell me you have a computer set up to do only BOINC, doing 67,000 points a day, whereas I use my one computer for everything, not just SETI, and Astropulse hangs the system too much for it to be usable: opening a client takes 3 seconds. Normal CUDA units don't do this; I only have trouble if I'm watching a movie, so I turn the GPU off but leave the CPUs running with no problems. I'm not a newbie or somebody who doesn't know much about computers; I'm very knowledgeable and have formal training." Well, I won't tell you my position in the country as it may be too much for you. :) I will tell you that SETI is my main project, though due to supply and demand I have 3 other backup projects to take up the slack when SETI can't keep me supplied with work, which seems to be a day or 2 each week these days, especially for the GPUs, but my 2500K is also hard to keep fed. I do pause my crunching when doing audio/video editing, as my programs use both CPU and GPUs, but for anything else I let them keep going, and I do use all 3 every day, so none are totally dedicated crunchers. I will say that if I did do OpenCL APs on a video card I would at least reserve a CPU core just to feed them so that lag wouldn't be a problem. I was not trying to dis you, and I have no formal training, but I have been building my own PCs since well before SETI started. I also noticed that you don't belong to a team, but if you do want to join one then please consider team Australia. Cheers. |
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
Glen wrote: "So why did SETI send me an Astropulse unit when the settings are set not to?" Since you didn't copy/post the exact settings, we can only wonder how they are set. Instead of "the settings are set not to", it is more informative to just copy the lines, e.g.: SETI@home Enhanced: yes; Astropulse v505: no; SETI@home v7: no; AstroPulse v6: yes; If no work for selected applications is available, accept work from other applications? yes. Without exact information from you, I can only guess that you forgot to set: If no work for selected applications is available, accept work from other applications? NO. P.S. To protect your CPU/GPU from overheating, use TThrottle: http://www.efmer.eu/boinc/ - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Bernie Vine Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328 |
Seems like I have more problems: all the GPU units on this machine error. http://setiathome.berkeley.edu/results.php?hostid=5708417&offset=0&show_names=0&state=6&appid= I did a reboot when I saw the problem, then noticed this in the log: SETI@home 19/02/2013 22:25:09 Aborting task 08dc12af.32581.62739.7.10.36_0: exceeded elapsed time limit 2143.66 (53690.16G/25.05G) I have just changed to BOINC 7.0.52; I have 4 other machines using it that are not reporting errors. Any help would be appreciated. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Bernie Vine wrote: "Seems like I have more problems, all the GPU units in this machine error." This is because of a change in BOINC 7.0.33: the flops value for GPU tasks gets increased by a factor of 10 (when no flops values have been supplied in the app_info), so existing GPU tasks will be put on the verge of going "Maximum time exceeded": client: when estimating FLOPS for an anonymous-platform app version for which no estimate has been supplied by user, use (CPU speed)*(cpu_usage + 10*gpu_usage) (--> add the 10*) Fresh GPU tasks will get revised <rsc_fpops_est> and <rsc_fpops_bound> values and will complete O.K. You can either reset the project and get your tasks resent, or use the -177 protection option in BoincRescheduler (but that'll mess up the time estimates). Claggy |
Bernie Vine Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328 |
Thanks, I knew it had to be 7.0.52 but did not know what! Why did it only affect this machine? PS: I did a reset and the tasks are being downloaded now. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Bernie Vine wrote: "Thanks, I knew it had to be 7.0.52 but did not know what!" It has affected both machines that run anonymous platform and BOINC 7.0.52: Error tasks for computer 5708417; Error tasks for computer 6851722. The other machine is running the stock apps; the project supplies a flops value for that. Claggy |
Floyd Joined: 19 May 11 Posts: 524 Credit: 1,870,625 RAC: 0 |
I guess this is the ERROR thread... So I have this "Error": http://setiathome.berkeley.edu/result.php?resultid=2842392732 That same WU has 2 inconclusives and 2 "Error while computing"... ??? Maybe just a corrupted file? |
Terror Australis Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44 |
Floyd wrote: "....That same WU has 2 inconclusives and 2 error while computing... ???" The answer is in the bottom line of the stderr_txt output file: "cudaAcc_find_triplets doesn't support more than MAX_TRIPLETS_ABOVE_THRESHOLD numBinsAboveThreshold in find_triplets_kernel" Yes, they are noisy units :-) T.A. |
William Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0 |
Floyd wrote: "I guess this is the ERROR Thread..." It's not a generic error thread; Bernie just wasn't very specific with his thread title ;) That WU shows very nicely where we stand with the optimised CUDA app. The error itself is as old as the CUDA app. I call it a bug (because it throws an error); Jason begs to differ and calls it a design flaw. The original CUDA application cannot handle certain triplet conditions and will error with a -12. You are running x41g, where Jason already removed quite a few (but not all) of the -12 causes. Host 6269362, belonging to Juan, is running the latest x41zc, which IIRC does not produce any -12 errors. Why did the unit go inconclusive then? If you look at stderr you'll see that the CPU found 31 pulses and the GPU 26 pulses and 5 triplets. Both WUs reported as -9 (overflow), but they reported a different subset of signals because of the different processing order between the CPU and the GPU app. As I said, a very nice example of how the app matured and what issues are left for further development. William the Silent A person who won't read has no advantage over one who can't read. (Mark Twain) |
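
Claggy's explanation of the "exceeded elapsed time limit" abort can be checked with a little arithmetic. The sketch below is not BOINC source code. It only assumes, as the figures in Bernie's log line suggest, that the limit is the task's rsc_fpops_bound divided by the client's FLOPS estimate, and it applies the formula Claggy quotes. The function names, the `gpu_weight` parameter, and the inputs (2.5 GFLOPS CPU speed, cpu_usage of 0.02, gpu_usage of 1.0) are hypothetical, picked only so that the new estimate lands on the 25.05G shown in the log.

```python
GIGA = 1e9


def estimated_flops(cpu_speed, cpu_usage, gpu_usage, gpu_weight):
    """FLOPS estimate per the formula quoted in the thread:
    (CPU speed) * (cpu_usage + gpu_weight * gpu_usage).
    gpu_weight=10 models the behaviour after the change; gpu_weight=1 is an
    assumption about the earlier behaviour, implied by "add the 10*"."""
    return cpu_speed * (cpu_usage + gpu_weight * gpu_usage)


def elapsed_time_limit(rsc_fpops_bound, flops):
    """Seconds a task may run before being aborted (assumed: bound / flops)."""
    return rsc_fpops_bound / flops


# Hypothetical host values, NOT taken from Bernie's machine.
cpu_speed = 2.5 * GIGA
cpu_usage = 0.02
gpu_usage = 1.0

# Bound taken from the log line: 53690.16G.
rsc_fpops_bound = 53690.16 * GIGA

old_flops = estimated_flops(cpu_speed, cpu_usage, gpu_usage, gpu_weight=1)
new_flops = estimated_flops(cpu_speed, cpu_usage, gpu_usage, gpu_weight=10)

old_limit = elapsed_time_limit(rsc_fpops_bound, old_flops)
new_limit = elapsed_time_limit(rsc_fpops_bound, new_flops)

print(f"old estimate: {old_flops / GIGA:.2f}G -> limit {old_limit:.0f} s")
print(f"new estimate: {new_flops / GIGA:.2f}G -> limit {new_limit:.0f} s")
```

With these assumed inputs the new limit comes out near 2143 seconds, close to the 2143.66 in the log (the figures printed in the log are themselves rounded), while the same bound under the old estimate would have allowed roughly ten times longer. That is consistent with tasks downloaded before the upgrade being aborted while freshly issued tasks, which carry revised bounds, complete normally.
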
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.