Could 6.12.33 be the cause for my invalids?
Author | Message |
---|---|
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
From the minute I upgraded to the new BOINC version 6.12.33, my Lunatics ATI apps (both SETI and Beta) started to run for only about 20 seconds and get marked as invalid (at least the ones that aren't pending anymore). Of course it could be my GPU, but it would be a weird coincidence for failures to start right after the upgrade; I didn't have a single invalid before. (The GPU temperature is even lower than it used to be.) The strangest thing is that the stderr output of these tasks (e.g. http://setiathome.berkeley.edu/result.php?resultid=1983386504) is only one line: <core_client_version>6.12.33</core_client_version> Is anyone else experiencing this? Any clues/suggestions? Also, is there anything to watch out for when downgrading BOINC? "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |
Mike Joined: 17 Feb 01 Posts: 34262 Credit: 79,922,639 RAC: 80 |
I don't think it's BOINC, but downgrade just to make sure. Just install the old version over the top. I've seen an out of memory message in one unit. I guess it's the card. With each crime and every kindness we birth our future. |
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
Thanks Mike, I'll try the downgrade. @Out of memory: I'm pretty sure that's another issue, which I'll post in a few minutes. ;) "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
You were right, it wasn't BOINC. I'm still getting these 20-second tasks. Are these 'normal' super-shorties? But that would still not explain the stderr (now with the older version number, of course): <core_client_version>6.12.26</core_client_version> "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |
skildude Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
Have you tried a reboot yet? In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
Yes, rebooted before starting this thread. "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |
Mike Joined: 17 Feb 01 Posts: 34262 Credit: 79,922,639 RAC: 80 |
Which settings are you running? Did you change something in app_info.xml? Do you have something in autostart? With each crime and every kindness we birth our future. |
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
Settings: -period_iterations_num 14 -instances_per_device 1 (or did you mean other settings?) I haven't changed app_info.xml for a week. Autostart: do you mean what is running? Nothing new since before this started: IE9, WMP12, WLM2011, Trillian, Skype, MSE and CCC. I just 'found' a task that got past the 20 seconds and finished in an hour, including a stderr: http://setiathome.berkeley.edu/result.php?resultid=1987166007 (perhaps there is still nothing out of the ordinary) ...so not every task is affected. However, one that looks just like it has the same issue again: http://setiathome.berkeley.edu/result.php?resultid=1987166005 "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |
Mike Joined: 17 Feb 01 Posts: 34262 Credit: 79,922,639 RAC: 80 |
Maybe something is consuming your resources. Have you run a virus scan lately? It could be malware or spyware. Make sure the BOINC data directory is excluded from the virus scanner. With each crime and every kindness we birth our future. |
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
As I expected, the virus scan was negative.
> Something is consuming your resources maybe.
GPU resources? Other than Windows Aero and IE9? Well, nothing that CCC or GPU-Z would show, because when BOINC is suspended the GPU shows 0% workload @157/300 (minimum MHz). Anyway, what harm would that do? There were days when I played Star Trek Online while everything, including the GPU, was crunching without errors or problems.
> Make sure Boinc data directory is excluded from virus scanner.
Makes no difference regarding this issue. In any case I won't be able to track this next week, because I'm going to be in Graz cheering for Austria @ the American Football World Championship. ;) "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Looking through your tasks, it looks as if MB r177 worked when you had the Cat 11.5 (CAL 1.4.1385) drivers installed and hasn't worked since you installed Cat 11.6 (CAL 1.4.1417). So try uninstalling Cat 11.6 and installing Cat 11.5 again. Claggy |
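The kind of comparison Claggy describes can be tallied quickly if you note the driver version and outcome of each task by hand. A minimal Python sketch follows, assuming a hand-made list of records; the rows below are made-up placeholders, not the OP's actual task history, and `tally_by_driver` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical, hand-collected records: (driver label, validated?).
# These rows are placeholders for illustration, not real task results.
records = [
    ("Cat 11.5", True),
    ("Cat 11.5", True),
    ("Cat 11.6", False),
    ("Cat 11.6", True),
    ("Cat 11.6", False),
]


def tally_by_driver(rows):
    """Return {driver: Counter} with 'valid' and 'invalid' counts."""
    summary = {}
    for driver, ok in rows:
        key = "valid" if ok else "invalid"
        summary.setdefault(driver, Counter())[key] += 1
    return summary


for driver, counts in sorted(tally_by_driver(records).items()):
    total = counts["valid"] + counts["invalid"]
    print(f"{driver}: {counts['invalid']}/{total} invalid")
```

If the invalids cluster under one driver version, the driver is the likely suspect; if they are spread evenly, it probably is not.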
Slavac Joined: 27 Apr 11 Posts: 1932 Credit: 17,952,639 RAC: 0 |
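OP I'd also suggest upgrading to the Lunatics 39 build, completely killed any invalids I was having. |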
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
> OP I'd also suggest upgrading to the Lunatics 39 build, completely killed any invalids I was having.
It's a good app indeed, but it won't help his ATI card. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
W5GA, W5TAT, W8QR, K6XT Joined: 25 Sep 99 Posts: 42 Credit: 23,144,377 RAC: 6 |
> OP I'd also suggest upgrading to the Lunatics 39 build, completely killed any invalids I was having.
Is this installed automatically by the Lunatics unified installer 0.38? |
perryjay Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
No, the installer gives you the x38g build. This thread will lead you to the x39e build... http://setiathome.berkeley.edu/forum_thread.php?id=64739&nowrap=true#1125635 You have to make some changes to the app_info file. PROUD MEMBER OF Team Starfire World BOINC |
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
Thanks for the suggestions while I was away.
> Looking through your tasks, it looks as if MB r177 worked when you had Cat 11.5 (CAL 1.4.1385) drivers installed and hasn't worked since you had Cat 11.6 (CAL 1.4.1417) installed,
I thought about the driver version myself and was sure I had installed it at least a week before the problems started. Anyhow, I reverted to Cat 11.5 to test this, of course. No change. A couple of tests I ran myself with the last 6 workunits point to the weird coincidence I mentioned: a hardware issue. <_< It seems that when a workunit fails, it fails at the beginning; once past that point it seems to run fine. With these 6 workunits the failure rate is about 50% (assuming the ones that ran through would validate). The failures seem to be independent of the workunit, the CPU load and the GPU temperature, because there were failures and successes in each case. A 'failing' task still sends something back, of course; however, the XML lacks a few starting tags, which would explain the empty stderr. Furthermore, gameplay doesn't show any kind of (graphical) error, and neither do GPU stability tests, which only resulted in higher temperatures than while crunching but still didn't show anything wrong. I found a memtest for CUDA... does anyone know one for ATI/OpenCL? "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |
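For anyone wanting to repeat this kind of check over more than a handful of results, here is a rough Python sketch. It assumes you have already copied the stderr text of each result from its result page; `stderr_is_empty` and `failure_rate` are hypothetical helper names. It treats a stderr that holds nothing but the `<core_client_version>` tag as a failed start, which matches the symptom in this thread but is only a heuristic.

```python
import re


def stderr_is_empty(stderr_text):
    """True if the stderr holds nothing except the <core_client_version> tag."""
    remainder = re.sub(
        r"<core_client_version>.*?</core_client_version>",
        "",
        stderr_text,
        flags=re.S,
    )
    return remainder.strip() == ""


def failure_rate(stderr_texts):
    """Fraction of results whose stderr is empty apart from the version tag."""
    if not stderr_texts:
        return 0.0
    failed = sum(1 for text in stderr_texts if stderr_is_empty(text))
    return failed / len(stderr_texts)


# The first sample is the one-line stderr quoted in this thread; the second
# is a made-up stand-in for a task that did produce application output.
samples = [
    "<core_client_version>6.12.26</core_client_version>",
    "<core_client_version>6.12.26</core_client_version>\n<![CDATA[\napp output\n]]>",
]
print(failure_rate(samples))
```

With only 6 workunits a rate of "about 50%" is a very rough figure, so a larger batch would say more about whether the failures are really random.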
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
The Astropulse WU ran OK. I wonder if it's your settings for the r177_HD5 app. Can you post that section of the app_info, please? Claggy |
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <platform>windows_intelx86</platform>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>0.05</max_ncpus>
    <plan_class>ati13ati</plan_class>
    <cmdline>-period_iterations_num 14 -instances_per_device 1</cmdline>
    <coproc>
        <type>ATI</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>MB_6.10_win_SSE3_ATI_HD5_r177.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>MultiBeam_Kernels_r177.cl</file_name>
        <open_name>MultiBeam_Kernels.cl</open_name>
        <copy_file/>
    </file_ref>
</app_version>
<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <platform>windows_x86_64</platform>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>0.05</max_ncpus>
    <plan_class>ati13ati</plan_class>
    <cmdline>-period_iterations_num 14 -instances_per_device 1</cmdline>
    <coproc>
        <type>ATI</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>MB_6.10_win_SSE3_ATI_HD5_r177.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>MultiBeam_Kernels_r177.cl</file_name>
        <open_name>MultiBeam_Kernels.cl</open_name>
        <copy_file/>
    </file_ref>
</app_version>
The only thing I changed from the original that the Lunatics Installer created was the -period_iterations_num. "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |
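When editing these sections by hand, it is easy to change the command line for one platform and forget the other. The following Python sketch is one way to list the options per platform and confirm they match. It works on a trimmed copy of the fragment posted above; the `<wrapper>` root element and the helper names `parse_cmdline` and `cmdlines_by_platform` are assumptions made for illustration, not part of BOINC or the Lunatics installer.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fragment from the post; a real app_info.xml has more.
FRAGMENT = """
<app_version>
  <app_name>setiathome_enhanced</app_name>
  <version_num>610</version_num>
  <platform>windows_intelx86</platform>
  <plan_class>ati13ati</plan_class>
  <cmdline>-period_iterations_num 14 -instances_per_device 1</cmdline>
</app_version>
<app_version>
  <app_name>setiathome_enhanced</app_name>
  <version_num>610</version_num>
  <platform>windows_x86_64</platform>
  <plan_class>ati13ati</plan_class>
  <cmdline>-period_iterations_num 14 -instances_per_device 1</cmdline>
</app_version>
"""


def parse_cmdline(cmdline):
    """Turn '-flag value -flag value' into a dict (assumes one value per flag)."""
    tokens = cmdline.split()
    return {
        tokens[i].lstrip("-"): tokens[i + 1]
        for i in range(0, len(tokens) - 1, 2)
    }


def cmdlines_by_platform(fragment):
    """Map each <platform> to its parsed <cmdline> options."""
    # Wrap in a dummy root (arbitrary name) so sibling blocks parse as one document.
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    result = {}
    for version in root.iter("app_version"):
        platform = version.findtext("platform", default="")
        result[platform] = parse_cmdline(version.findtext("cmdline", default=""))
    return result


options = cmdlines_by_platform(FRAGMENT)
for platform, opts in options.items():
    print(platform, opts)

# True when every platform entry carries the same settings.
print(len({tuple(sorted(o.items())) for o in options.values()}) == 1)
```

In the posted fragment both platform entries carry the same `-period_iterations_num 14 -instances_per_device 1`, so a mismatch between them is not the cause here.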
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
> The only thing I changed from the original the Lunatics Installer created was the -period_iterations_num.
The app_info section looked fine. When I first ran r177_HD5, my HD5770 only needed -period_iterations_num 2. I can't remember what driver/SDK that was on, probably Cat 10.10/SDK2.2. I'd just try -period_iterations_num 2 first; if there's no improvement, downgrade your driver to Cat 11.2/SDK2.3. There have been some changes in SDK2.4 that increased the amount of memory available to OpenCL apps, and there might be side effects. The other strange thing is that your HD5770 is only reporting 9 Compute Units at 700MHz; my HD5770 has 10 Compute Units at 850MHz. Claggy |
Acrklor Joined: 22 Oct 01 Posts: 14 Credit: 639,144 RAC: 0 |
Neither -period_iterations_num 2 (or other values) nor downgrading to Cat 11.2/SDK2.3 seems to make any difference. I've got the HD5750, which has only 9 Compute Units @ 700 MHz. "Judging people you don't know for things you don't understand is just really stupid." - Ellen Page |