Exception when running AstroPulse OpenCL unit with Mesa/Clover
Author | Message |
---|---|
Aaron Puchert Joined: 28 Mar 08 Posts: 5 Credit: 432,715 RAC: 0 |
AstroPulse GPU work units fail on my machine because a string is constructed from a null pointer. The GPU is an AMD Radeon HD 8570M, known to Linux as AMD HAINAN. Instead of the proprietary driver from AMD (fglrx), I run the open source driver (radeon) with Mesa, which has an OpenCL implementation (Clover). The kernel sources didn't compile at first because the LLVM OpenCL compiler apparently doesn't support inlining. Hence I removed the "inline" specification of calc_chirp. Then the code compiles but AstroPulse fails (https://setiathome.berkeley.edu/result.php?resultid=4844090137) with the following error:

terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct null not valid
SIGABRT: abort called
Stack trace (18 frames):
../../projects/setiathome.berkeley.edu/astropulse_7.08_x86_64-pc-linux-gnu__opencl_ati_100(boinc_catch_signal+0x4d)[0x4c6a6d]
/lib64/libpthread.so.0(+0x10d10)[0x7fc0e695cd10]
/lib64/libc.so.6(gsignal+0x38)[0x7fc0e5900bf8]
/lib64/libc.so.6(abort+0x13a)[0x7fc0e590204a]
/usr/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x15d)[0x7fc0e600a80d]
/usr/lib64/libstdc++.so.6(+0x94896)[0x7fc0e6008896]
/usr/lib64/libstdc++.so.6(+0x948e1)[0x7fc0e60088e1]
/usr/lib64/libstdc++.so.6(+0x94af8)[0x7fc0e6008af8]
/usr/lib64/libstdc++.so.6(_ZSt19__throw_logic_errorPKc+0x3f)[0x7fc0e603000f]
/usr/lib64/libstdc++.so.6(_ZNSs12_S_constructIPKcEEPcT_S3_RKSaIcESt20forward_iterator_tag+0x1f)[0x7fc0e6048f6f]
/usr/lib64/libstdc++.so.6(_ZNSsC2EPKcRKSaIcE+0x36)[0x7fc0e6049356]
../../projects/setiathome.berkeley.edu/astropulse_7.08_x86_64-pc-linux-gnu__opencl_ati_100[0x4844fc]
../../projects/setiathome.berkeley.edu/astropulse_7.08_x86_64-pc-linux-gnu__opencl_ati_100[0x484b62]
../../projects/setiathome.berkeley.edu/astropulse_7.08_x86_64-pc-linux-gnu__opencl_ati_100[0x47121a]
../../projects/setiathome.berkeley.edu/astropulse_7.08_x86_64-pc-linux-gnu__opencl_ati_100[0x4619fc]
../../projects/setiathome.berkeley.edu/astropulse_7.08_x86_64-pc-linux-gnu__opencl_ati_100[0x46a345]
/lib64/libc.so.6(__libc_start_main+0xf0)[0x7fc0e58ec5b0]
../../projects/setiathome.berkeley.edu/astropulse_7.08_x86_64-pc-linux-gnu__opencl_ati_100[0x40bda9]
Exiting...

The error is thrown in _ZNSs12_S_constructIPKcEEPcT_S3_RKSaIcESt20forward_iterator_tag, which is char* std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag), when the first argument (the begin iterator) is a null pointer. The parent function is std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&), which is apparently called with a null pointer as first argument. Because there are no debug symbols for the AstroPulse executable, I cannot dig deeper. Any idea how this happened? |
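For illustration, the failure mode in the trace above can be avoided by checking a C string for null before handing it to std::string. This is only a minimal sketch: `string_or` is a hypothetical helper, not something taken from the AstroPulse sources, and which call in the application actually receives the null pointer is not known from the trace.

```cpp
#include <string>

// Hypothetical helper (not from the AstroPulse sources): build a std::string
// from a C string that may be null. Constructing std::string directly from a
// null pointer is what the libstdc++ build in the trace above rejects with
// std::logic_error, so the null case is handled before construction.
std::string string_or(const char* s, const std::string& fallback = "") {
    return s ? std::string(s) : fallback;
}
```

A call such as `string_or(ptr, "unknown")` returns "unknown" when `ptr` is null and a copy of the pointed-to text otherwise.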
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
You may want to read through this thread of the last endeavours of a soul trying to use the Mesa drivers. He ran into the same error as you did: "Error : Building Program (binary, clBuildProgram):main kernels: not OK code -43". Juha wrote: "-43 is CL_INVALID_BUILD_OPTIONS. The compiler is probably choking on some ATI specific compiler option. Too bad it doesn't actually include the build log even though it says so." The user in that thread (Paul) has given up, as far as I can see from his returned tasks. But you could probably send him a private message and ask how far he got, and whether the two of you can try to work together on getting it fixed. Edit: you may also want to stroll through this thread at the BOINC boards, with the same user and the same problems. And with Juha and me. :) |
Aaron Puchert Joined: 28 Mar 08 Posts: 5 Credit: 432,715 RAC: 0 |
Thanks for the links! I wasn't aware that there were other threads for this issue already. Of course the proprietary driver and the open source driver are different, so it's no surprise that an application hand-tuned for one OpenCL platform doesn't work for another out-of-the-box. |
Aaron Puchert Joined: 28 Mar 08 Posts: 5 Credit: 432,715 RAC: 0 |
I've noticed that compiling the AstroPulse kernels (without any options) yields the following error: AstroPulse_Kernels_r2751.cl build error: input.cl:254:22: warning: double precision constant requires cl_khr_fp64, casting to single precision /usr/local/include/clc/float/definitions.h:50:25: note: expanded from macro 'M_PI' unsupported call to function calc_chirp in dechirp_range1_kernel error: build error. The warning is a little bit annoying, but apparently harmless. The error, however, is a bit mysterious. Both occurrences of calc_chirp are declared as "inline", and removing this specification eliminates the error. At least the kernels compile now (without options). I haven't checked yet whether this really works; I'm waiting for new tasks. Are the optimized AstroPulse apps open source, so I could have a look at how the kernels are built? I can only find the sources for the stock application (http://setiboinc.cvs.sourceforge.net/viewvc/setiboinc/?view=tar). This would be nice to have if there really is a problem with the build options. |
Aaron Puchert Joined: 28 Mar 08 Posts: 5 Credit: 432,715 RAC: 0 |
Never mind, I found the source code: https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt/. It's hard to look through the forest of #ifdefs, but it seems that for AMD GPUs we have an option "-fno-bin-amdil", which is not part of the standard, so Mesa doesn't understand it. This option seems to suppress the generation of some AMD-specific intermediate language code, so it could probably just be omitted. |
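As a rough sketch of what "just omitting" the option could look like on the host side, the function below removes one option from a build-options string before it would be passed to clBuildProgram. `drop_build_option` is a hypothetical name, and the other options used in the usage note are only illustrative; this is not how the sah_v7_opt sources assemble their options.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: remove every occurrence of one whitespace-separated
// option (e.g. "-fno-bin-amdil") from an OpenCL build-options string.
// Assumption: no option contains quoted spaces, since the string is split on
// whitespace and rejoined with single spaces.
std::string drop_build_option(const std::string& options,
                              const std::string& unwanted) {
    std::istringstream in(options);
    std::string token;
    std::string result;
    while (in >> token) {
        if (token == unwanted) {
            continue;  // skip the option the platform does not understand
        }
        if (!result.empty()) {
            result += ' ';
        }
        result += token;
    }
    return result;
}
```

For example, `drop_build_option("-cl-fast-relaxed-math -fno-bin-amdil", "-fno-bin-amdil")` yields `"-cl-fast-relaxed-math"`. Whether the kernels then build and validate under Clover would still need to be tested.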
Juha Joined: 7 Mar 04 Posts: 388 Credit: 1,857,738 RAC: 0 |
I was waiting for Paul to test the Multibeam app in anon platform mode before forwarding everything to the developers, but he disappeared and I guess I kind of forgot about it. Anyway, you could either blank out the bad compiler option with a hex editor or edit the source code and re-compile. If you decide to compile the app yourself, you need to first compile the BOINC API and libs (from the master branch). To compile the Seti GPU apps, check out the entire sah_v7_opt tree. The tree is not stable, so you'll want to use the exact same revision that the stock apps are built from. Use the configure line from one of the configure_*.txt files. The currently distributed Astropulse app appears to have been compiled from the AP_BLANKIT tree, and the Multibeam app is from the AKv8 tree. |
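The hex-editor route amounts to overwriting the option's bytes in place without changing the file size. The sketch below shows that idea on an in-memory byte buffer; `blank_out` is a hypothetical helper, and using spaces as the filler is an assumption (it keeps the rest of the embedded options string intact). Reading the executable into the buffer and writing it back, ideally to a copy, is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: overwrite every occurrence of `needle` in `image` with
// spaces. The buffer keeps its size, so all offsets inside the binary stay
// valid. Returns the number of occurrences that were blanked.
std::size_t blank_out(std::vector<char>& image, const std::string& needle) {
    if (needle.empty()) {
        return 0;
    }
    const auto len = static_cast<std::ptrdiff_t>(needle.size());
    std::size_t count = 0;
    auto it = image.begin();
    while ((it = std::search(it, image.end(), needle.begin(), needle.end()))
           != image.end()) {
        std::fill(it, it + len, ' ');  // same length, so nothing shifts
        it += len;
        ++count;
    }
    return count;
}
```

Calling `blank_out(image, "-fno-bin-amdil")` on the loaded file contents and checking that the returned count is what you expect is a cheap sanity check before saving the patched copy.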
Aaron Puchert Joined: 28 Mar 08 Posts: 5 Credit: 432,715 RAC: 0 |
Thank you. I think I'll first try it with the hex editor. But I don't have any work units at the moment and it seems they are hard to get. Are there maybe dummy work units out there that can be used for testing? |
Juha Joined: 7 Mar 04 Posts: 388 Credit: 1,857,738 RAC: 0 |
Lunatics have test material available. Or you can download one straight from the server. Workunit 2118565406, file. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.