I am getting a lot of gpu tasks with zero (0) expected processing times.
Author | Message |
---|---|
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
"You are not muddying the waters any more than they already are Bill. The code is a definite patchwork by DA over many years. It is very fragmented code and very hard to follow. Everyone that looks at the code says the same thing." I haven't tried an older driver, but I don't think that is possible now. I needed to update to 18.10.20 in order to use the current BIOS version for my motherboard. I did that before installing BOINC, so I didn't get to see how it performed prior to that. Regardless, MB works fine, so I would agree that it is a code issue and not a driver issue. Seti@home classic: 1,456 results, 1.613 years CPU time |
rob smith Send message Joined: 7 Mar 03 Posts: 22202 Credit: 416,307,556 RAC: 380 |
I assume you've suspended the AP. A quick look shows that the peak_flops is still way too high. Since that is written very early in the life of the task, some more digging might be needed to find a) where it is written and b) exactly where the data used in its calculation comes from. What is even more peculiar is that Tom had the same problem on MBs, did the same sort of patch (or at least I think he did), and it worked for him (I must check back and see if it was a temporary fix or if it's stuck). Checked on Tom's computer and it's still dropping loads and has a crazy peak_flops. Since Tom & you are showing different values, that rules out one thought I had but didn't mention earlier; it would have really confused everyone. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
"I assume you've suspended the AP." Just to be sure, when you say "doing the same sort of patch", what do you mean exactly? I have not updated anything, I just restarted BOINC. I think I assumed from your previous posts that anything you changed would have been handled external to my computer. Seti@home classic: 1,456 results, 1.613 years CPU time |
rob smith Send message Joined: 7 Mar 03 Posts: 22202 Credit: 416,307,556 RAC: 380 |
Setting peak_flops to something a bit more sensible..... Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
"Setting peak_flops to something a bit more sensible....." Okay, now I finally see what you're saying. I'll edit peak_flops in the coproc_info.xml file tonight when I get a chance and see what happens. I really appreciate the help, Rob. Sorry for being slow with picking up what you're dropping. I think as I get more familiar with BOINC I'll be less of a pain. Seti@home classic: 1,456 results, 1.613 years CPU time |
Kissagogo27 Send message Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
When I look at my application details, I see an average GFLOPS figure per app. Can we use it to estimate the flops of the GPU in use? Theoretically, an HD7750 has ~1500 GFLOPS, BOINC reports ~2048 GFLOPS, but the autotuned AP app gives ~500 GFLOPS ^^ |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
"Setting peak_flops to something a bit more sensible....." "Okay, now I finally see what you're saying. I'll edit peak_flops in the coproc_info.xml file tonight when I get a chance and see what happens. I really appreciate the help, Rob. Sorry for being slow with picking up what you're dropping. I think as I get more familiar with BOINC I'll be less of a pain." I did edit that and it doesn't appear to have made a difference :( I still get a lot of 0 expected processing time gpu tasks. Tom A proud member of the OFA (Old Farts Association). |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
If the change to flops isn't being picked up in coproc_info.xml, then you will have to make the change in the client_state.xml file, in the app_version entry for the Astropulse gpu app. [Edit] Thought a more permanent place to put the flops statement would be into the gpu section of the app_info.xml file. That one won't get changed by the project at every connection and will just be read. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
rob smith Send message Joined: 7 Mar 03 Posts: 22202 Credit: 416,307,556 RAC: 380 |
Thanks for the feedback Tom, I had a feeling that would be the case :-( (Looking at the code again I can see that these values are re-set on restart, and are over-written regularly during operation PROVIDED tasks complete in a "timely" manner and without error. Keith's description of how convoluted the code is is pretty accurate; there are so many patches and work-arounds that it is very difficult to actually see the flow through - and I do know that there are a lot of "required" features that are there for very good reasons.) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
"If the change to flops isn't being picked up in coproc_info.xml, then you will have to make the change in the client_state.xml file for the Astropulse entry in app_version entry for the gpu app." As you know, I am fumble-fingered on stuff I don't use regularly. Could you show me an example for the app_config.xml file? Thanks, Tom A proud member of the OFA (Old Farts Association). |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Not the app_config file but the app_info file. You need to put your halved flops value into the flops field in the app_version section of the ATI gpu section. Since I don't know the structure of the ATI card's app_info, I don't know the specifics. This is the relevant document explaining where the flops statement goes. https://boinc.berkeley.edu/wiki/Anonymous_platform Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I would just alter the flops value for the meantime. You will have to wait until DA fixes the issue but at least it has his attention now. He has opened a bug on the client: "Incorrect peak FLOPS for AMD GPU". It can take quite a while for the code to be fixed and then a new BOINC released with the fix. I would not wait and just modify the flops for now. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
"Not the app_config file but the app_info file. You need to put your halved flops value into the flops field in the app_version section of the ATI gpu section." I don't even see that I have an app_info file. I'm okay with that for now, though. I feel like I'm getting past my skis by editing flops values in files I am not familiar with in the least. I'm just going to exclude using the GPU for all AP tasks for now. This has inspired me to start understanding the files for how BOINC works, and to maybe get back into programming! Now if I could only find the time to dedicate to it... Thanks again to everyone who has helped with this problem! Seti@home classic: 1,456 results, 1.613 years CPU time |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Sorry, just made an assumption you were running anonymous platform. I have for so long, and all my contacts and friends do also, that I forget sometimes that people mostly just run stock from the project. You are out of luck till BOINC is fixed I guess. Let's hope that DA considers that bug urgent and provides new code and a new release in short order. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Bill Send message Joined: 30 Nov 05 Posts: 282 Credit: 6,916,194 RAC: 60 |
No worries, I might run anonymous applications eventually once I get a better handle. I can be patient for now. Seti@home classic: 1,456 results, 1.613 years CPU time |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Yes, a good first step on that journey would be the optimized Lunatics Installer and applications. http://mikesworld.eu/download.html That will get you an anonymous platform and the app_info you would need for the temp fix. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"I would just alter the flops value for the meantime. You will have to wait until DA fixes the issue but at least it has his attention now. He has opened a bug on the client." David has been working with a SU user, and this morning put up a Windows test version accessible from https://ci.appveyor.com/project/BOINC/boinc/builds/22082918/artifacts (details in pull request #3001). Could somebody observing this problem please try out the fix? You simply need to download the win-client archive and replace boinc.exe in your BOINC Programs directory. Possible extra tools needed, if you don't have them already: A 7z archive program, like https://www.7-zip.org/ The Microsoft Visual C++ Redistributable Packages for Visual Studio 2013, from https://www.microsoft.com/en-US/download/details.aspx?id=40784 (or your local Microsoft site) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
@ Rob Smith, It is certainly the case that zero processing time estimates are caused by faulty GFlops Peak calculations in boinc.exe. In the case which David has been investigating, the device (an APU) is reported as having 1127 GFLOPS peak when running at normal speeds, but 43980464 GFLOPS peak when DOWN-clocked. Huh? David has interpreted this as an arithmetic overflow in https://github.com/BOINC/boinc/blob/master/client/gpu_opencl.cpp#L609: I'm suspicious. It would be interesting to know whether the same fault appears for CAL flops, or whether it's just OpenCL. My own initial expectation in #2988 was that we would need to be looking at https://github.com/BOINC/boinc/blob/master/lib/coproc.cpp#L840: that's where the driver API is queried, triggering in turn a driver query of the underlying hardware. Thoughts? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
OK, we've got a reply back from the SU user: `03/02/2019 13:21:40 \| \| <![CDATA[cc_config.xml not found - using defaults]]>` So, David's fix didn't fix anything - don't bother testing any further. Since it was in a `} else { //////////// OTHER GPU OR ACCELERATOR ////////////// // Put each coprocessor instance into a separate other_opencls element` section, I'm not wildly surprised. But if anyone else can work their way through the spaghetti code, please pitch in. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
User has supplied their coproc_info: `<ati_opencl> <name>AMD Radeon(TM) Vega 8 Graphics</name> <vendor>Advanced Micro Devices, Inc.</vendor> <vendor_id>4098</vendor_id> <available>1</available> <half_fp_config>0</half_fp_config> <single_fp_config>190</single_fp_config> <double_fp_config>63</double_fp_config> <endian_little>1</endian_little> <execution_capabilities>1</execution_capabilities> <extensions>cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_liquid_flash cl_amd_planar_yuv</extensions> <global_mem_size>6885523456</global_mem_size> <local_mem_size>32768</local_mem_size> <max_clock_frequency>42949672</max_clock_frequency> <max_compute_units>8</max_compute_units> <nv_compute_capability_major>0</nv_compute_capability_major> <nv_compute_capability_minor>0</nv_compute_capability_minor> <amd_simd_per_compute_unit>4</amd_simd_per_compute_unit> <amd_simd_width>16</amd_simd_width> <amd_simd_instruction_width>1</amd_simd_instruction_width> <opencl_platform_version>OpenCL 2.1 AMD-APP (2766.5)</opencl_platform_version> <opencl_device_version>OpenCL 2.0 AMD-APP (2766.5)</opencl_device_version> <opencl_driver_version>2766.5 (PAL,HSAIL)</opencl_driver_version> <device_num>0</device_num> <peak_flops>43980464128000000.000000</peak_flops> <opencl_available_ram>6885523456.000000</opencl_available_ram> <opencl_device_index>0</opencl_device_index> <warn_bad_cuda>0</warn_bad_cuda> </ati_opencl>` It shows the same fault in `<max_clock_frequency>`, which may be helpful. |
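
The figures in the coproc_info block above can be checked with a little arithmetic. The sketch below is not taken from the BOINC source: it assumes the peak is simply compute units × SIMDs per unit × SIMD width × instruction width × 2 operations per cycle × clock in Hz. Under that assumption the bogus `<max_clock_frequency>` alone reproduces the reported `<peak_flops>` exactly, which suggests the clock value handed to BOINC is the bad input rather than the multiplication itself. The function names and the 10 GHz plausibility limit are hypothetical, chosen only for illustration.

```python
# Values copied from the <ati_opencl> block posted in the thread.
MAX_COMPUTE_UNITS = 8
SIMD_PER_COMPUTE_UNIT = 4
SIMD_WIDTH = 16
SIMD_INSTRUCTION_WIDTH = 1
MAX_CLOCK_MHZ = 42949672                    # reported <max_clock_frequency>
REPORTED_PEAK_FLOPS = 43980464128000000.0   # reported <peak_flops>

# ASSUMPTION (not from the BOINC source): 2 operations per lane per cycle.
OPS_PER_CYCLE = 2


def estimate_peak_flops(clock_mhz: int) -> float:
    """Hypothetical estimate: SIMD lanes * ops per cycle * clock in Hz."""
    lanes = (MAX_COMPUTE_UNITS * SIMD_PER_COMPUTE_UNIT
             * SIMD_WIDTH * SIMD_INSTRUCTION_WIDTH)
    return float(lanes * OPS_PER_CYCLE * clock_mhz * 1_000_000)


def looks_plausible(clock_mhz: int, limit_mhz: int = 10_000) -> bool:
    """Hypothetical sanity check: reject clocks above an arbitrary limit."""
    return 0 < clock_mhz <= limit_mhz


bogus = estimate_peak_flops(MAX_CLOCK_MHZ)

print("matches reported peak_flops:", bogus == REPORTED_PEAK_FLOPS)
print("bogus GFLOPS:", bogus / 1e9)
print("clock equals 2**32 // 100:", MAX_CLOCK_MHZ == 2**32 // 100)
print("GFLOPS at an assumed 1100 MHz:", estimate_peak_flops(1100) / 1e9)
print("clock plausible:", looks_plausible(MAX_CLOCK_MHZ))
```

With an assumed clock of about 1100 MHz the same formula lands close to the 1127 GFLOPS quoted earlier for the device at normal speed. The reported clock also equals 2^32 divided by 100 and rounded down, which looks like a wrapped or sentinel 32-bit value, though that reading is a guess.
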
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.