No Usable GPU in Linux
Author | Message |
---|---|
David Anderson (not *that* DA) Joined: 5 Dec 09 Posts: 215 Credit: 74,008,558 RAC: 74 |
I've modified app_info.xml on the dual-760's machine to read along the lines you suggested, but naming what apps and shared libraries I have right now. I have no idea if the result is correct or useful. <app_info> <app> <name>setiathome_v7</name> </app> <file_info> <name>MBv7_7.05r2549_sse42_linux64</name> <executable/> </file_info> <app_version> <app_name>setiathome_v7</app_name> <version_num>705</version_num> <cmdline></cmdline> <file_ref> <file_name>MBv7_7.05r2549_sse42_linux64</file_name> <main_program/> </file_ref> </app_version> <app> <name>astropulse_v7</name> </app> <file_info> <name>ap_7.05r2728_sse3_linux64</name> <executable/> </file_info> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>x86_64-pc-linux-gnu</platform> <plan_class></plan_class> <cmdline></cmdline> <file_ref> <file_name>ap_7.05r2728_sse3_linux64</file_name> <main_program/> </file_ref> </app_version> <file_info> <name>setiathome_x41g_x86_64-pc-linux-gnu_cuda32</name> <executable/> </file_info> <file_info> <name>libcudart.so.3</name> <executable/> </file_info> <file_info> <name>libcufft.so.3</name> <executable/> </file_info> <app_version> <app_name>setiathome_v7</app_name> <version_num>705</version_num> <plan_class>cuda32</plan_class> <avg_ncpus>0.1</avg_ncpus> <max_ncpus>0.1</max_ncpus> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>setiathome_x41g_x86_64-pc-linux-gnu_cuda32</file_name> <main_program/> </file_ref> <file_ref> <file_name>libcudart.so.3</file_name> </file_ref> <file_ref> <file_name>libcufft.so.3</file_name> </file_ref> </app_version> </app_info> |
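One way to sanity-check a hand-edited app_info.xml before restarting BOINC is to confirm that every file named in a <file_ref> also has a <file_info> entry and actually sits next to the XML file. What follows is only a minimal sketch, not a BOINC tool: the function name check_app_info is made up for illustration, and it relies on plain grep/sed pattern matching rather than a real XML parser, so it assumes the simple tag layout used in the post above.

```shell
#!/bin/sh
# Hypothetical helper (not part of BOINC): cross-check an app_info.xml.
# Prints one line per problem; prints nothing if every reference resolves.
check_app_info() {
    file=$1
    dir=$(dirname "$file")
    # Pull every <file_name>...</file_name> value, one per line, de-duplicated.
    grep -o '<file_name>[^<]*</file_name>' "$file" |
        sed 's/<[^>]*>//g' | sort -u |
        while IFS= read -r name; do
            # Each referenced file needs a matching <file_info><name> entry.
            grep -qF "<name>$name</name>" "$file" ||
                echo "no file_info entry for: $name"
            # The file itself must exist in the same directory.
            [ -e "$dir/$name" ] || echo "file not found: $name"
        done
}
```

It would be called with the path to the file, for example check_app_info /var/lib/boinc-client/projects/setiathome.berkeley.edu/app_info.xml. That path is an assumption (the usual location for Ubuntu's packaged client); adjust it to wherever your setiathome.berkeley.edu folder lives.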
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"I've modified app_info.xml on the dual-760's machine" I think the problem is you have the AstroPulse App in between the Setiathome Apps. I like to make complete sections, it's less confusing to me. If you moved the AP App to below the Setiathome App it would probably work. The Newer CUDA 60 App will probably work better on your 760s. In that case you would just Remove the CUDA 32 section and Replace it with the section I posted. Of course you would need to add the three CUDA 60 files to the setiathome.berkeley.edu folder with the app_info.xml file. The 9500 would work better with the CUDA 32 App. <app_info> <app> <name>setiathome_v7</name> </app> <file_info> <name>MBv7_7.05r2549_sse42_linux64</name> <executable/> </file_info> <app_version> <app_name>setiathome_v7</app_name> <version_num>705</version_num> <file_ref> <file_name>MBv7_7.05r2549_sse42_linux64</file_name> <main_program/> </file_ref> </app_version> <app> <name>astropulse_v7</name> </app> <file_info> <name>ap_7.05r2728_sse3_linux64</name> <executable/> </file_info> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>x86_64-pc-linux-gnu</platform> <file_ref> <file_name>ap_7.05r2728_sse3_linux64</file_name> <main_program/> </file_ref> </app_version> <app> <name>setiathome_v7</name> </app> <file_info> <name>setiathome_x41zc_x86_64-pc-linux-gnu_cuda60</name> <executable/> </file_info> <file_info> <name>libcudart.so.6.0</name> <executable/> </file_info> <file_info> <name>libcufft.so.6.0</name> <executable/> </file_info> <app_version> <app_name>setiathome_v7</app_name> <version_num>704</version_num> <plan_class>cuda60</plan_class> <avg_ncpus>0.1</avg_ncpus> <max_ncpus>0.1</max_ncpus> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>setiathome_x41zc_x86_64-pc-linux-gnu_cuda60</file_name> <main_program/> </file_ref> <file_ref> <file_name>libcudart.so.6.0</file_name> </file_ref> <file_ref> <file_name>libcufft.so.6.0</file_name> </file_ref> </app_version> </app_info> |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
"It would be Nice if the mbcuda.cfg settings worked in Linux...and OSX." Requests noted. I will have to work out how Linux renicing works, and put generic code in place for parsing the cfg file. ;) On the original issue: my Linux dev machine runs Ubuntu 14.04 LTS and, last I checked, is up to date. Every time a kernel update or driver install occurred, the X display would break and go to a black screen on boot (the usual gymnastics of entering a text console and reinstalling the display driver would routinely follow). Recently I worked out that the reason for the driver pre-install script failure message was that Ubuntu's dkms kernel management wasn't installed (odd), so I installed it about a week ago, and am hoping both kernel and display driver updates go more smoothly next time: sudo apt-get install dkms "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
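The missing-dkms problem described above can be checked for before a driver install is attempted. The following is a rough sketch under stated assumptions: the function name check_driver_prereqs is hypothetical, and the check for kernel headers relies on the conventional /lib/modules/<kernel release>/build link, which is not mentioned in the post but is where module builds normally look.

```shell
#!/bin/sh
# Hypothetical pre-flight check before installing a display driver.
# $1: kernel release to check (defaults to the running kernel)
# $2: modules root (defaults to /lib/modules; overridable for testing)
# Returns 0 if nothing is missing, 1 otherwise.
check_driver_prereqs() {
    krel=${1:-$(uname -r)}
    root=${2:-/lib/modules}
    status=0
    # dkms rebuilds out-of-tree modules when the kernel is updated.
    if ! command -v dkms >/dev/null 2>&1; then
        echo "dkms not found (on Ubuntu: sudo apt-get install dkms)"
        status=1
    fi
    # Kernel headers are normally reachable through the 'build' link.
    if [ ! -e "$root/$krel/build" ]; then
        echo "no kernel headers at $root/$krel/build"
        status=1
    fi
    return $status
}
```

Running check_driver_prereqs with no arguments inspects the running kernel; silence and a zero exit status mean neither prerequisite was found missing, which is not a guarantee that the driver install itself will succeed.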
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Some recent experiences that might help. It has become clear there is a problem when using the nVidia driver .run package to uninstall the driver. As best I can tell, if you don't install another driver before rebooting you will probably be stuck in the Login Loop from Hell. See http://askubuntu.com/search?q=nvidia+login+loop for some posts about it. I've tried just about every suggestion at Ask.Ubuntu and the only solution I've found is to drop into the console and install another driver. There are also problems with this, as some cards will not even display the console in this situation. The only solution to that is to remove the card and use a different card until you can get a driver installed. This reminds me Very much of the problem I was having in Windows 8/8.1 when having the nVidia driver do a Clean install. Basically the System went into lockdown as soon as the old driver was removed. Some say the Ubuntu problem is caused by not writing the Linux Headers back the way they should be written; all I know is it's nasty. During a recent bout I found it was possible to have a situation where both a repository driver and a proprietary driver were installed at the same time. According to the proprietary driver, it was 'backing-up' the repository driver while installing, but the repository driver was left installed. In my case I had files for libnvidia-opencl.so.304 & libnvidia-opencl.so.343 installed side by side. BOINC refused to see OpenCL until one was removed; in fact, both were removed as there is a problem with the 343 clBuild program, i.e., it doesn't work. Anyway, for those that have been trying all sorts of remedies to get BOINC to see OpenCL, you might want to try a Clean install. It is possible to FUBAR a system to where it is not worth saving. I did get this system going again, but I've been stuck in the Login Loop before and just reinstalled everything, even the Home folder where the Login Loop appears to originate. Have Fun. |
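The side-by-side libnvidia-opencl situation described above can be looked for directly. This is a sketch only: the function name find_opencl_conflicts is hypothetical, and the directories to search are an assumption that depends on the distribution and on how each driver was installed. Symlinks such as a bare .so.1 link are skipped so that a single normal install is not reported as a conflict.

```shell
#!/bin/sh
# Hypothetical check: look for NVIDIA OpenCL libraries from more than one
# driver version under the given directories (at least one is required).
# Returns 0 if at most one version is present, 1 otherwise.
find_opencl_conflicts() {
    # Regular files only, so symlinks to the real library are ignored.
    libs=$(find "$@" -type f -name 'libnvidia-opencl.so.*' 2>/dev/null | sort)
    # Reduce each path to its version suffix, e.g. 304.x versus 343.x.
    versions=$(printf '%s\n' "$libs" |
        sed -n 's/.*libnvidia-opencl\.so\.//p' | sort -u)
    count=$(printf '%s\n' "$versions" | grep -c . || true)
    if [ "$count" -gt 1 ]; then
        echo "multiple NVIDIA OpenCL library versions found:"
        printf '%s\n' "$libs"
        return 1
    fi
    return 0
}
```

For example, find_opencl_conflicts /usr/lib /usr/lib32 /usr/lib64 would cover the common library trees; those directory names are guesses, so substitute the ones your system uses.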
Baiteh Joined: 10 Sep 15 Posts: 34 Credit: 7,705,483 RAC: 0 |
Installed the toolkit - blimey! Now I get how you guys get the RAC with 3 or 4 980Ti rigs, lol! |
David Anderson (not *that* DA) Joined: 5 Dec 09 Posts: 215 Credit: 74,008,558 RAC: 74 |
Was running with the nvidia driver installed by sgfxi. A 14.04 kernel update resulted in absurd window/text sizes (very, very large). Apparently the nouveau driver? Uh oh. Did: (1) Ctrl-Alt-F2 (to leave X for a text console) (2) log in at the text console (3) sudo su - (4) service lightdm stop (5) sudo apt-get remove `dpkg-query --show '*nvidia*'` (6) apt-get remove dkms sgfxi (and now back to the 355 nvidia driver, and on a reboot to test that this survives reboot, all is well with BOINC on restarting boinc-client). Still not really getting Seti GPU tasks much, if at all, but oh well. |
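A note on the dpkg-query step above: by default dpkg-query --show prints each package name followed by a tab and its version, so substituting its raw output into apt-get remove passes the version strings along as if they were package names, which may make apt-get complain. A small sketch of keeping only the names follows; the function name pkg_names_only is made up for illustration.

```shell
#!/bin/sh
# Hypothetical filter: keep only the package-name column from
# `dpkg-query --show` output (name, tab, version on each line).
pkg_names_only() {
    awk '{ print $1 }'
}

# Example (review the list before removing anything):
#   dpkg-query --show '*nvidia*' | pkg_names_only
# dpkg-query can also print names directly:
#   dpkg-query --show --showformat='${Package}\n' '*nvidia*'
```

Either form only lists names. Whether every matching package should really be removed is a separate judgment call, so it seems safer to read the list first rather than pipe it straight into apt-get remove.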
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
For the Host with the two 760's you could just rename the app_info.xml to app_info1.xml and see if it will receive work running as Stock. It says it has OpenCL and you don't have any existing tasks. If the one with the 750 is the one that threw the Error with the CUDA Toolkit I'd reformat and install a fresh copy. It Still doesn't list OpenCL and tracking down the problem would take longer than just reinstalling the OS and installing the Toolkit. If you want to try it with just CUDA I'll post my app_info with your CPU App listed, I'll leave the GPU AstroPulse App listed just in case you get OpenCL working. If you don't get OpenCL working it would be best to remove the GPU AstroPulse section or you will get Errors without OpenCL. The AP App is here; http://boinc2.ssl.berkeley.edu/beta/download/astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100 http://boinc2.ssl.berkeley.edu/beta/download/AstroPulse_Kernels_r2751.cl http://boinc2.ssl.berkeley.edu/beta/download/ap_cmdline_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100.txt <app_info> <app> <name>astropulse_v7</name> </app> <file_info> <name>astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100</name> <executable/> </file_info> <file_info> <name>ap_cmdline_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100.txt</name> </file_info> <app_version> <app_name>astropulse_v7</app_name> <platform>x86_64-pc-linux-gnu</platform> <version_num>708</version_num> <plan_class>opencl_nvidia_linux</plan_class> <coproc> <type>NVIDIA</type> <count>1</count> </coproc> <avg_ncpus>0.1</avg_ncpus> <max_ncpus>0.1</max_ncpus> <file_ref> <file_name>astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100</file_name> <main_program/> </file_ref> <file_ref> <file_name>ap_cmdline_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100.txt</file_name> <open_name>ap_cmdline.txt</open_name> </file_ref> </app_version> <app> <name>setiathome_v7</name> </app> <file_info> <name>setiathome_x41zc_x86_64-pc-linux-gnu_cuda60</name> <executable/> </file_info> <file_info> <name>libcudart.so.6.0</name> <executable/> </file_info> <file_info> <name>libcufft.so.6.0</name> <executable/> </file_info> <file_info> <name>mbcuda.cfg</name> </file_info> <app_version> <app_name>setiathome_v7</app_name> <version_num>704</version_num> <plan_class>cuda60</plan_class> <avg_ncpus>0.1</avg_ncpus> <max_ncpus>0.1</max_ncpus> <coproc> <type>CUDA</type> <count>1</count> </coproc> <file_ref> <file_name>setiathome_x41zc_x86_64-pc-linux-gnu_cuda60</file_name> <main_program/> </file_ref> <file_ref> <file_name>libcudart.so.6.0</file_name> </file_ref> <file_ref> <file_name>libcufft.so.6.0</file_name> </file_ref> <file_ref> <file_name>mbcuda.cfg</file_name> </file_ref> </app_version> <app> <name>astropulse_v7</name> </app> <file_info> <name>ap_7.05r2728_sse3_linux64</name> <executable/> </file_info> <app_version> <app_name>astropulse_v7</app_name> <version_num>705</version_num> <platform>x86_64-pc-linux-gnu</platform> <file_ref> <file_name>ap_7.05r2728_sse3_linux64</file_name> <main_program/> </file_ref> </app_version> <app> <name>setiathome_v7</name> </app> <file_info> <name>MBv7_7.05r2549_sse42_linux64</name> <executable/> </file_info> <app_version> <app_name>setiathome_v7</app_name> <platform>x86_64-pc-linux-gnu</platform> <version_num>705</version_num> <file_ref> <file_name>MBv7_7.05r2549_sse42_linux64</file_name> <main_program/> </file_ref> </app_version> </app_info> That's the same app_info.xml I'm running. I stole the mbcuda.cfg file from Windows, it doesn't work in Linux though. Hmmm, Now the 750 says it has OpenCL, http://setiathome.berkeley.edu/show_host_detail.php?hostid=7748035 So, how did you get it working? |
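The rename suggested above (app_info.xml to app_info1.xml, to try running Stock) is easy to make reversible. Below is a minimal sketch; the function name toggle_app_info and its stock/anon keywords are invented for illustration, and it assumes the BOINC client is stopped first and restarted afterwards so that it notices the change.

```shell
#!/bin/sh
# Hypothetical helper: park or restore app_info.xml in a project directory.
# Usage: toggle_app_info <project_dir> stock|anon
#   stock  rename app_info.xml  -> app_info1.xml (run stock apps)
#   anon   rename app_info1.xml -> app_info.xml  (back to your own apps)
toggle_app_info() {
    dir=$1
    case $2 in
        stock) from=app_info.xml;  to=app_info1.xml ;;
        anon)  from=app_info1.xml; to=app_info.xml ;;
        *)
            echo "usage: toggle_app_info <project_dir> stock|anon" >&2
            return 2 ;;
    esac
    if [ ! -e "$dir/$from" ]; then
        echo "nothing to do: $dir/$from not found" >&2
        return 1
    fi
    # Refuse to clobber a file that is already in the way.
    if [ -e "$dir/$to" ]; then
        echo "refusing to overwrite $dir/$to" >&2
        return 1
    fi
    mv "$dir/$from" "$dir/$to"
}
```

As the post points out, this is best tried when the host has no existing tasks, since work fetched under one setup may not carry over to the other.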
David Anderson (not *that* DA) Joined: 5 Dec 09 Posts: 215 Credit: 74,008,558 RAC: 74 |
Been focusing on the host with one 750. Event log shows Seti does not see the GPU or request GPU tasks, though Einstein and BOINC do see the GPU. |
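When comparing what the client and each project report about the GPU, it can help to pull only the GPU-related lines out of a saved copy of the event log. A small sketch follows; the function name gpu_log_lines is hypothetical, and the search words (gpu, cuda, opencl) are a guess at what the relevant messages contain rather than a list taken from BOINC itself.

```shell
#!/bin/sh
# Hypothetical helper: show GPU-related lines from a saved BOINC event log.
# $1: path to the log file
gpu_log_lines() {
    # Case-insensitive match on the words GPU messages are likely to contain.
    grep -iE 'gpu|cuda|opencl' "$1"
}
```

On Ubuntu's packaged client the log is probably /var/lib/boinc-client/stdoutdae.txt, but treat that path as an assumption and check where your installation keeps it.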
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.