I am puzzled...


Seahawk
Volunteer tester
Joined: 8 Jan 08
Posts: 914
Credit: 2,962,243
RAC: 42,015
United States
Message 1202091 - Posted: 3 Mar 2012, 14:51:09 UTC

I am puzzled by the behavior of my Q6600 system overnight. Last night, when I last looked, all the ATI WUs had a remaining time of 7 hours or so. This morning they are all at 90+ hours. The CPU WUs are showing 1-3 hours remaining, which looks to be the same as last night. Everything in the log looks OK to me and work is being done in the expected amount of time. I'm just not sure why the DCF for the GPU went whacko.
____________
I used to be a cruncher like you, then I took an arrow to the knee.

Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 22356
Credit: 29,266,927
RAC: 24,038
Germany
Message 1202093 - Posted: 3 Mar 2012, 14:56:43 UTC

You had one unit running over 16,000 seconds.

____________

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,909,639
RAC: 13,531
United Kingdom
Message 1202094 - Posted: 3 Mar 2012, 14:59:52 UTC - in response to Message 1202093.

You had one unit running over 16,000 seconds.

And several VLAR running over 25,000 seconds - like task 2334375975.

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 3960
Credit: 31,820,065
RAC: 10,250
United Kingdom
Message 1202098 - Posted: 3 Mar 2012, 15:03:34 UTC - in response to Message 1202094.

You had one unit running over 16,000 seconds.

And several VLAR running over 25,000 seconds - like task 2334375975.

Sounds like the ATI low GPU use Bug,

Claggy

Seahawk
Volunteer tester
Joined: 8 Jan 08
Posts: 914
Credit: 2,962,243
RAC: 42,015
United States
Message 1202099 - Posted: 3 Mar 2012, 15:07:07 UTC

OK, so they should come back down over time? The system has only been up and crunching for 38 hours since it got overhauled. I know the two ATI 6670 GPUs aren't the fastest crunchers, but 90 hours is a little out of the ballpark.
____________
I used to be a cruncher like you, then I took an arrow to the knee.

Seahawk
Volunteer tester
Joined: 8 Jan 08
Posts: 914
Credit: 2,962,243
RAC: 42,015
United States
Message 1202101 - Posted: 3 Mar 2012, 15:09:07 UTC

How do I check for this bug and is it correctable?
____________
I used to be a cruncher like you, then I took an arrow to the knee.

Michel448a
Volunteer tester
Joined: 27 Oct 00
Posts: 1188
Credit: 2,891,544
RAC: 174
Canada
Message 1202103 - Posted: 3 Mar 2012, 15:12:51 UTC
Last modified: 3 Mar 2012, 15:13:33 UTC

Each time a GPU task ticks over, the estimates can drop 30 seconds to 1 minute each; but when a CPU task ticks over, especially the big long ones, they can jump 8 to 12 minutes (on the estimates for each WU in the queue).

BOINC doesn't do estimates for CPU and GPU separately. It's one estimate for both.


Am I right, guys?
____________
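That single scale factor is the pre-7.0 BOINC duration correction factor (DCF): one per project, applied to the estimate of every queued task, CPU and GPU alike. A minimal sketch of the idea, with made-up numbers and simplified damping (illustrative only, not the actual client code):

# One DCF shared by CPU and GPU tasks (pre-7.0 BOINC clients).
dcf = 1.0

def on_task_finished(estimated_secs, actual_secs):
    # Nudge the shared DCF toward actual/estimated runtime: it jumps
    # up at once on a slow task, drifts down slowly on fast ones.
    global dcf
    ratio = actual_secs / estimated_secs
    if ratio > dcf:
        dcf = ratio
    else:
        dcf += 0.1 * (ratio - dcf)

def remaining_estimate(base_estimate_secs):
    # Every queued task, CPU or GPU, is scaled by the same factor.
    return base_estimate_secs * dcf

# One 25,000-second VLAR that was estimated at 3,000 seconds...
on_task_finished(3000, 25000)
# ...inflates every GPU estimate in the cache by the same factor:
print(remaining_estimate(600) / 60)  # a 10-minute GPU task now shows ~83 minutes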

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,909,639
RAC: 13,531
United Kingdom
Message 1202107 - Posted: 3 Mar 2012, 15:21:32 UTC

The Application details for that host are showing a pretty insane APR for the ATI app.

I'll leave it to the ATI specialists to decide how plausible it is.

Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 22356
Credit: 29,266,927
RAC: 24,038
Germany
Message 1202117 - Posted: 3 Mar 2012, 15:44:13 UTC - in response to Message 1202101.

How do I check for this bug and is it correctable?


Not much you can do.
First thing is to restart the computer.
Reduce instances to 3; better would be 2 on your card.

Each time you notice a slowdown, suspend the GPU for 30 seconds and then resume.

____________
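With the optimised apps, the instance count comes from the -instances_per_device switch in app_info.xml (it appears in janneseti's snippet below). For two tasks per GPU, the relevant lines would look something like this (the exact values are illustrative; check the app's readme for your card):

<!-- two tasks share each GPU, so each task claims half a device -->
<cmdline>-period_iterations_num 20 -instances_per_device 2</cmdline>
<coproc>
    <type>ATI</type>
    <count>0.5</count>
</coproc>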

janneseti
Joined: 14 Oct 09
Posts: 79
Credit: 469,720
RAC: 140
Sweden
Message 1202210 - Posted: 3 Mar 2012, 22:31:12 UTC

I have the same problem with an ATI GPU.
The APR value shown on the Application details page is insane.
But if you want the remaining time in BOINC Manager to show more accurately for your GPU, there is a way.
In your BOINC folder, C:\ProgramData\BOINC\projects\setiathome.berkeley.edu, there is a file app_info.xml.
Add a new entry <flops>16000000000</flops> to the <app_version> section.
The value, in my case 16 GFlops, will probably have to be adjusted to match your GPU. (BOINC only reads app_info.xml at startup, so restart the client after editing it.)

Here is a snippet from app_info.xml.

<app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>0.05</max_ncpus>
    <!-- estimated speed of this app on the GPU; tune to your card -->
    <flops>16000000000</flops>
    <plan_class>ati13ati</plan_class>
    <cmdline>-period_iterations_num 20 -instances_per_device 1</cmdline>
    <coproc>
        <type>ATI</type>
        <count>1</count>
    </coproc>
    <file_ref>
        <file_name>MB6_win_x86_SSE3_OpenCL_ATi_HD5_r390.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>MultiBeam_Kernels_r390.cl</file_name>
        <copy_file/>
    </file_ref>
</app_version>

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 3542
Credit: 46,067,116
RAC: 30,320
United States
Message 1202230 - Posted: 4 Mar 2012, 0:18:03 UTC - in response to Message 1202210.

Actually you want it in a different format than that.

It should look something like this:
<flops>200.52843553287e09</flops>

That is for my GPUs.
____________

janneseti
Joined: 14 Oct 09
Posts: 79
Credit: 469,720
RAC: 140
Sweden
Message 1202235 - Posted: 4 Mar 2012, 0:46:04 UTC - in response to Message 1202230.

Actually, you can use either format to give this value.

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,909,639
RAC: 13,531
United Kingdom
Message 1202237 - Posted: 4 Mar 2012, 0:50:49 UTC - in response to Message 1202230.
Last modified: 4 Mar 2012, 1:28:57 UTC

Actually you want it in a different format than that.

It should look something like this:
<flops>200.52843553287e09</flops>

That is for my GPUs.

According to Joe Segur, the parser that handles that value can cope with most reasonable numeric and scientific formats.

You do, however, need to keep your wits about you when dealing with numbers as large as that. Not many of us (except the bankers) regularly deal with a couple of hundred billion - of anything, even flops.

If you're used to exponential notation, 200e9 is indeed easier to get right than 200000000000. But then, why not 2e11? For this purpose, the minute fractional bits after the decimal point really don't matter.
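Assuming the parser really is that tolerant, all of these should come to effectively the same value:

<flops>200528435533</flops>        <!-- plain integer, easy to miscount -->
<flops>200.52843553287e09</flops>  <!-- arkayn's form -->
<flops>2e11</flops>                <!-- near enough; the fraction doesn't matter -->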

There is a nice example of this in today's New Scientist magazine. The website is subscription-only, so I'll have to quote instead of link:

BANKS are under orders to tighten their accounting, but we fear HSBC may be paying too much attention to the small things rather than the gigapounds. Andrew Beggs wanted to know the distance from his home to the bank's nearest branch. He consulted the bank's website, which came up with the answer 0.9904670356841079 miles (1.5940021810076005 kilometres). This is supposedly accurate to around 10^-13 metres, much less than the radius of a hydrogen atom. Moisture condensing on the branch's door would bring it several significant digits closer.

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,909,639
RAC: 13,531
United Kingdom
Message 1202321 - Posted: 4 Mar 2012, 10:28:57 UTC - in response to Message 1202237.
Last modified: 4 Mar 2012, 10:31:46 UTC

Testing superscripts for DA.

BANKS are under orders to tighten their accounting, but we fear HSBC may be paying too much attention to the small things rather than the gigapounds. Andrew Beggs wanted to know the distance from his home to the bank's nearest branch. He consulted the bank's website, which came up with the answer 0.9904670356841079 miles (1.5940021810076005 kilometres). This is supposedly accurate to around 10^-13 metres, much less than the radius of a hydrogen atom. Moisture condensing on the branch's door would bring it several significant digits closer.

Superscripts seem OK, but don't click the 'Use BBCode tags to format your text' link while editing, or you'll lose your changes. I'll tell him.

tullio
Joined: 9 Apr 04
Posts: 3402
Credit: 344,918
RAC: 89
Italy
Message 1202325 - Posted: 4 Mar 2012, 11:00:32 UTC

I can read many articles in New Scientist without even registering. I am a registered non-paying guest at Nature magazine and can read all its editorials and even some articles; the same goes for Nature Communications.
Tullio
____________

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,909,639
RAC: 13,531
United Kingdom
Message 1202327 - Posted: 4 Mar 2012, 11:16:40 UTC - in response to Message 1202325.

I can read many articles in New Scientist without even registering. I am a registered non-paying guest at Nature magazine and can read all its editorials and even some articles; the same goes for Nature Communications.
Tullio

I tried the section link on the front page for that story (Feedback: Highway exit with no return) before posting, but it told me I had to log in first. Not everybody here will want to create even a guest registration just to read one silly story.

tullio
Joined: 9 Apr 04
Posts: 3402
Credit: 344,918
RAC: 89
Italy
Message 1202328 - Posted: 4 Mar 2012, 11:28:25 UTC - in response to Message 1202327.

I have just read and printed a short article "Oceans acidifying at unprecedented speed". I grab what I can without registering from New Scientist, but Nature is a more reliable source.
Tullio
____________

Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4134
Credit: 1,003,215
RAC: 231
United States
Message 1202411 - Posted: 4 Mar 2012, 18:13:59 UTC - in response to Message 1202107.

The Application details for that host are showing a pretty insane APR for the ATI app.

I'll leave it to the ATI specialists to decide how plausible it is.

So it was indicating 1.4 TeraFlops and has since grown to 2.2 TeraFlops. I'm no ATI specialist, but I need not be one to say that's ridiculously high.

The cause is result_overflow tasks which haven't been marked as runtime_outlier by a sah_validate process, because the BOINC core client API all too often truncates the stderr. Task 2334954485 is a current example, though it is due to be purged soon. It has Run time 39.31, CPU time 30.91, Credit 0.33, but only the first 8 lines of stderr.txt were captured, and the result_overflow line would have come much later. The Validator has always looked for the result_overflow keyword so that the assimilated result is marked, and now it is also used to tell BOINC not to include the runtime in its averages.

In January, Matt Arsenault (Milkyway) noted on the boinc_dev list that about 2% of reports had truncated stderr sections, and provided a patch to fix the problem. Dr. Anderson apparently wasn't convinced it needed fixing, and even if he had implemented the patch, it would only have been effective for those alpha testing BOINC 7.0.x clients.

As a related note, the runtime_outlier detection for Astropulse v6 will be based on information in the uploaded result file rather than the stderr, which is supposed to go back in requests to the Scheduler. Perhaps it would be a good idea to make a similar change for SETI@home v7.
Joe
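A rough sketch of the check Joe describes (illustrative only; the field and function names here are invented, not the real sah_validate source):

# Mark overflow results so their runtimes don't feed the host's APR.
def mark_if_overflow(task):
    if "result_overflow" in task.stderr_txt:
        task.runtime_outlier = True  # runtime excluded from the averages
    # If the client truncated stderr before the overflow line was written,
    # the keyword is lost and the abnormally short runtime wrongly
    # inflates APR: the failure mode seen on this host.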


Seahawk
Volunteer tester
Joined: 8 Jan 08
Posts: 914
Credit: 2,962,243
RAC: 42,015
United States
Message 1202422 - Posted: 4 Mar 2012, 18:56:22 UTC

I dropped to 2 tasks per GPU and the times have dropped from 90 hours to around 30. This is still 5-6 times longer than I have seen most tasks finish in. I'll just let it keep crunching and watch it.
____________
I used to be a cruncher like you, then I took an arrow to the knee.

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,909,639
RAC: 13,531
United Kingdom
Message 1202477 - Posted: 4 Mar 2012, 21:34:16 UTC - in response to Message 1202321.

Superscripts seem OK, but don't click the 'Use BBCode tags to format your text' link while editing, or you'll lose your changes. I'll tell him.

It's safe to use again now, though I'm not sure I remember all the page headers being in the formatting pop-up before.
