High performance Linux clients at SETI

Stephen "Heretic" Volunteer tester Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628
Message 1990404 - Posted: 17 Apr 2019, 16:33:45 UTC - in response to Message 1990361.

[quote]I'm assuming that anyone running a special build knows their way round the system: find the stderr.txt file in the slot directory where this particular task is running, and see if that contains any clues. Once you've done that investigation, try suspending that particular task and allowing it to run again once it's cleared itself out of memory - that sometimes kicks 'em into life.[/quote]

. . That would be good advice but sadly too late in this case ... I killed it, but want to document to the group in case it was useful:

Stephen :(
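The advice quoted above (look at stderr.txt in the slot directory of the stuck task) can be sketched in a few lines of Python. This is only an illustration: the data directory `/var/lib/boinc-client` is an assumption (the usual location for a Debian/Ubuntu package install, other installs keep it elsewhere), and the function name `tail_slot_stderr` is made up for this example.

```python
from pathlib import Path

# Assumption: data directory of a packaged Linux BOINC client; adjust to your install.
BOINC_DATA_DIR = Path("/var/lib/boinc-client")


def tail_slot_stderr(data_dir, lines=20):
    """Return {slot name: last `lines` lines of stderr.txt} for every slot that has one."""
    result = {}
    slots = Path(data_dir) / "slots"
    try:
        if not slots.is_dir():
            return result
        for slot in sorted(slots.iterdir()):
            log = slot / "stderr.txt"
            if log.is_file():
                text = log.read_text(errors="replace")
                result[slot.name] = text.splitlines()[-lines:]
    except PermissionError:
        # The slots directory is often readable only by the boinc user.
        pass
    return result


if __name__ == "__main__":
    for slot_name, tail in tail_slot_stderr(BOINC_DATA_DIR).items():
        print(f"--- slot {slot_name} ---")
        print("\n".join(tail))
```

Run it as a user that can read the BOINC data directory; the slot whose log stops growing is the one worth suspending and resuming.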

Siran d'Vel'nahr Volunteer tester Joined: 23 May 99 Posts: 7379 Credit: 44,181,323 RAC: 238
Message 1990407 - Posted: 17 Apr 2019, 16:57:24 UTC - in response to Message 1990401. Last modified: 17 Apr 2019, 17:06:52 UTC

[quote]Actually I intended to quote the old post with instructions and the link in it. Most others were able to figure it out. I also mentioned the ReadMe files which also state how to change to the CUDA 10.1 App;

From the ReadMe in setiathome.berkeley.edu/docs

7) If you have an AMD CPU move the AMD CPU App in the folder 'For AMD CPUs' to the root level, change the App names in the CPU section of the app_info.xml (<name> & <file_name>), and see if that works better. If you have a CUDA 10.1 driver you can use the CUDA101 App, change the app_info.xml to name the CUDA 10.1 App in the Two locations, <name> & <file_name>

It's the Same in Every version of BOINC on Every Platform, absolutely Nothing different in Windows, Linux, or Mac.

Windows = http://mikesworld.eu/download.html
Linux = http://lunatics.kwsn.info/index.php?action=downloads;cat=48
Mac = https://arkayn.us/forum/index.php?PHPSESSID=1f59a52c29828c5235c1d51133e07d30&topic=191.0

What's interesting is someone that has been a member of SETI for so long and still doesn't know how to load an App....[/quote]

Hi TBar,

My Linux box is barely 3 weeks into running SETI. Just how much do you assume I should learn in that short amount of time? At my age my memory is not good so I forget what I have for breakfast anymore. ;) Not that my memory was ever any good to begin with.

My previous incursions into trying Linux were with the stock apps. Set and forget. I really didn't start paying much attention to the "special apps" thing until I decided to build the Linux box. So, please, watch how you speak about users that have been around for a long time on SETI. Ok? :)

You know, I even tried running SETI on FreeBSD. Talk about a mass of confusion! ;) No GUI at all. I got it running somehow... :)

Have a great day! :)
Siran

[edit] The "most users" you are referring to are most likely Keith and many others that have been doing Linux for a far greater time than I. So yeah, they would figure it out. Noobies like me will take much longer to figure things out when it comes to Linux. I keep forgetting about "readme files" and using them for instructions. It is something I will need to get used to. What you quoted about AMD users would not apply to me since I am not an AMD user. ;) [/edit]

CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
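The ReadMe step quoted above (name the new app in the two locations, <name> and <file_name>) could look roughly like this in Python. It is a sketch under assumptions, not the ReadMe's own procedure: the file names ending in `_EXAMPLE` are hypothetical placeholders, the sample XML is cut down to the relevant elements, and both function names are invented for the illustration. The second function checks that every <file_ref> still points at a declared <file_info>, which is the mistake you get when only one of the two locations is edited.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened file names used only for this illustration.
OLD_APP = "setiathome_special_cuda90_EXAMPLE"
NEW_APP = "setiathome_special_cuda101_EXAMPLE"

SAMPLE_APP_INFO = """<app_info>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>setiathome_special_cuda90_EXAMPLE</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <file_ref>
      <file_name>setiathome_special_cuda90_EXAMPLE</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>"""


def rename_app_file(app_info_xml, old_name, new_name):
    """Swap old_name for new_name in <name> and <file_name>; return (xml, count)."""
    root = ET.fromstring(app_info_xml)
    changed = 0
    for tag in ("name", "file_name"):
        for elem in root.iter(tag):
            # Only exact matches are touched, so <app><name> entries stay as they are.
            if elem.text and elem.text.strip() == old_name:
                elem.text = new_name
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed


def unmatched_file_refs(app_info_xml):
    """List <file_ref> names that have no <file_info> entry of the same name."""
    root = ET.fromstring(app_info_xml)
    declared = {fi.findtext("name", "").strip() for fi in root.iter("file_info")}
    referenced = [fr.findtext("file_name", "").strip() for fr in root.iter("file_ref")]
    return [name for name in referenced if name not in declared]


if __name__ == "__main__":
    updated, count = rename_app_file(SAMPLE_APP_INFO, OLD_APP, NEW_APP)
    print(f"{count} entries renamed")
    print("unmatched file_ref entries:", unmatched_file_refs(updated))
```

Back up app_info.xml before trying anything like this on a live host, and stop the BOINC client first.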

Tom M Volunteer tester Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462
Message 1990417 - Posted: 17 Apr 2019, 19:29:06 UTC - in response to Message 1990339. Last modified: 17 Apr 2019, 19:38:46 UTC

[quote]Not quite sure how you get Task Manager or the proper name of System Monitor to show 100% usage. Unless you are running all 16 threads on cpu and gpu tasks. Can't you just use a <project_max_concurrent> statement and knock a few threads out of use to limit your usage to 75-80%?[/quote]

Here is my app_config.xml

<app_config>
  <project_max_concurrent>52</project_max_concurrent>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.33</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>2.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

For those of you who are confused by the current project_max_concurrent #: that is 6 gpus + 32 spoofed gpus + # of cpu threads that are not being used fulltime by gpus. 6+32+14 = 52

My app_info.xml

<app_info>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <platform>x86_64-pc-linux-gnu</platform>
    <version_num>801</version_num>
    <plan_class>cuda90</plan_class>
    <cmdline></cmdline>
    <coproc>
      <type>NVIDIA</type>
      <count>1</count>
    </coproc>
    <avg_ncpus>0.1</avg_ncpus>
    <max_ncpus>0.1</max_ncpus>
    <file_ref>
      <file_name>setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app>
    <name>astropulse_v7</name>
  </app>
  <file_info>
    <name>astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100</name>
    <executable/>
  </file_info>
  <file_info>
    <name>AstroPulse_Kernels_r2751.cl</name>
  </file_info>
  <file_info>
    <name>ap_cmdline_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100.txt</name>
  </file_info>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <platform>x86_64-pc-linux-gnu</platform>
    <version_num>708</version_num>
    <plan_class>opencl_nvidia_100</plan_class>
    <coproc>
      <type>NVIDIA</type>
      <count>1</count>
    </coproc>
    <avg_ncpus>0.1</avg_ncpus>
    <max_ncpus>0.1</max_ncpus>
    <file_ref>
      <file_name>astropulse_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>AstroPulse_Kernels_r2751.cl</file_name>
    </file_ref>
    <file_ref>
      <file_name>ap_cmdline_7.08_x86_64-pc-linux-gnu__opencl_nvidia_100.txt</file_name>
      <open_name>ap_cmdline.txt</open_name>
    </file_ref>
  </app_version>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>MBv8_8.22r3711_sse41_intel_x86_64-pc-linux-gnu</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <platform>x86_64-pc-linux-gnu</platform>
    <version_num>800</version_num>
    <file_ref>
      <file_name>MBv8_8.22r3711_sse41_intel_x86_64-pc-linux-gnu</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app>
    <name>astropulse_v7</name>
  </app>
  <file_info>
    <name>ap_7.05r2728_sse3_linux64</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <version_num>704</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <plan_class></plan_class>
    <file_ref>
      <file_name>ap_7.05r2728_sse3_linux64</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>

My local Boinc Manager is set to 90% of available cpu cores.

If I add -nobs to the command line, the Linux Task Manager pegs to 100% even though the % of cpus is 90.

If I run the gpus and cpus on a 1 to 1 ratio then -nobs behaves as I am expecting and the Task Manager continues to stay around 90%. But I lose the cpu calculations from 6 cpu threads that share 5% of the normal 6% processing with the 1% cpu processing to service the gpus.

If I drop the % of cpu below 33% I have trouble with the cpu tasks stalling and generating timeouts, which cause a report of task error.

What am I overlooking or screwing up?

Tom
A proud member of the OFA (Old Farts Association).

Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640
Message 1990418 - Posted: 17 Apr 2019, 19:33:26 UTC - in response to Message 1990417.

[quote]For those of you who are confused by the current project_max_concurrent # that is 6 gpus + 32 spoofed gpus + # of cpu threads that are not being used fulltime by gpus. 6+32+14 =[/quote]

pretty sure you're doing that wrong. it's #spoofed GPU + CPU threads. so you should be using 32+14 only. so 46.

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours
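The correction above comes down to one addition. A minimal sketch, assuming (as the reply implies) that the spoofed GPU count already stands in for the GPUs, so the 6 physical cards are not added again; the function name is only illustrative:

```python
def project_max_concurrent(spoofed_gpus, free_cpu_threads):
    """Spoofed GPU count plus the CPU threads left over for CPU tasks."""
    return spoofed_gpus + free_cpu_threads


# Figures from the thread: 32 spoofed GPUs and 14 free CPU threads.
print(project_max_concurrent(32, 14))  # 46

# The earlier figure also added the 6 physical GPUs: 6 + 32 + 14 = 52.
print(6 + project_max_concurrent(32, 14))  # 52
```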

Tom M Volunteer tester Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462
Message 1990419 - Posted: 17 Apr 2019, 19:39:33 UTC - in response to Message 1990418. Last modified: 17 Apr 2019, 19:40:52 UTC

[quote][quote]For those of you who are confused by the current project_max_concurrent # that is 6 gpus + 32 spoofed gpus + # of cpu threads that are not being used fulltime by gpus. 6+32+14 =[/quote]

pretty sure you're doing that wrong. it's #spoofed GPU + CPU threads. so you should be using 32+14 only. so 46.[/quote]

Ok, I will make that change. That means my other box is running with too high a project_max too.

Tom
A proud member of the OFA (Old Farts Association).

Tom M Volunteer tester Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462
Message 1990421 - Posted: 17 Apr 2019, 19:46:42 UTC - in response to Message 1990358.

[quote]Please don't add the 750 Ti's to that rig Tom as I'll be taking a benchmark from it. After I get my old wagon done for another 12mths on the road (by the end of this month) I'm getting a couple of SSD's to dual boot these 2 rigs of mine to get a bit more heat out of them this quickly coming up winter here. ;-)

Cheers.[/quote]

I will finish the troubleshooting and wait a bit before I install the 750Ti for production type testing.

Part of the question is will this MB run 7 gpus for production or not? It clearly will run 6. And the gui fails with some kind of bus error when I run 8 gpus. But maybe it will run 7.

Tom
A proud member of the OFA (Old Farts Association).

Keith Myers Volunteer tester Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873
Message 1990425 - Posted: 17 Apr 2019, 20:41:46 UTC

Happy that Ian jumped in and straightened you out on the use of the <project_max_concurrent>N</project_max_concurrent> parameter value with the spoofed gpu client.

Seti@Home classic workunits: 20,676 CPU time: 74,226 hours
A proud member of the OFA (Old Farts Association)

dallasdawg Joined: 19 Aug 99 Posts: 49 Credit: 142,692,438 RAC: 2
Message 1990445 - Posted: 18 Apr 2019, 1:00:49 UTC Last modified: 18 Apr 2019, 1:23:32 UTC

Petri and Tbar, thank you for your tireless efforts.

Stephen "Heretic" Volunteer tester Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628
Message 1990456 - Posted: 18 Apr 2019, 3:20:48 UTC - in response to Message 1990418.

[quote]pretty sure you're doing that wrong. it's #spoofed GPU + CPU threads. so you should be using 32+14 only. so 46.[/quote]

. . The question that raises in my mind is, how does one go about spoofing extra GPUs ??

Stephen ? ?

Tom M Volunteer tester Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462
Message 1990457 - Posted: 18 Apr 2019, 3:31:57 UTC

I don't think I will be installing this upgrade on my other box. And I will probably be downgrading this.

1) I am throwing a lot of "validation inconclusive"
2) And a lot of "computation errors".

I have switched to 1 cpu to 1 gpu on the off chance that #2 is being caused by cpu cores being overworked. However, I appear to still be getting "validation inconclusive".

This is running the CUDA 10.1 version with the latest 418 Nvidia drivers via Launchpad.

I expect I will also try it with a lower video driver version as well as the latest CUDA 9.1 app.

But it is clear that something isn't consistent with the results everyone was getting with the higher model Nvidia cards. Either I have something else set up wrong, or my MB doesn't agree with the latest software changes. Or something I haven't thought of.

I will leave my latest changes overnight and see if either of the above changes behavior.

Tom
A proud member of the OFA (Old Farts Association).

Tom M Volunteer tester Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462
Message 1990458 - Posted: 18 Apr 2019, 3:33:18 UTC - in response to Message 1990456.

[quote][quote]pretty sure you're doing that wrong. it's #spoofed GPU + CPU threads. so you should be using 32+14 only. so 46.[/quote]

. . The question that raises in my mind is, how does one go about spoofing extra GPUs ??

Stephen ? ?[/quote]

You change the source code in BOINC.exe

Tom
A proud member of the OFA (Old Farts Association).

petri33 Volunteer tester Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156
Message 1990459 - Posted: 18 Apr 2019, 3:33:24 UTC - in response to Message 1990421.

[quote][quote]Please don't add the 750 Ti's to that rig Tom as I'll be taking a benchmark from it. After I get my old wagon done for another 12mths on the road (by the end of this month) I'm getting a couple of SSD's to dual boot these 2 rigs of mine to get a bit more heat out of them this quickly coming up winter here. ;-)

Cheers.[/quote]

I will finish the troubleshooting and wait a bit before I install the 750Ti for production type testing.

Part of the question is will this MB run 7 gpus for production or not? It clearly will run 6. And the gui fails with some kind of bus error when I run 8 gpus. But maybe it will run 7.

Tom[/quote]

Hi,

One way is to test. Back up everything and go off-line during the test. I can say it runs with 4. TBar is running 7 or 11 GPUs on his rig so he might be able to say a definite yes or no.

To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones

Stephen "Heretic" Volunteer tester Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628
Message 1990460 - Posted: 18 Apr 2019, 3:51:03 UTC - in response to Message 1990445.

[quote]Petri and Tbar, thank you for your tireless efforts.[/quote]

+1

Stephen . .

Stephen "Heretic" Volunteer tester Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628
Message 1990462 - Posted: 18 Apr 2019, 3:52:57 UTC - in response to Message 1990458.

[quote][quote]. . The question that raises in my mind is, how does one go about spoofing extra GPUs ??

Stephen[/quote]

You change the source code in BOINC.exe

Tom[/quote]

. . OK, that sounds like it's above my pay grade and slightly dangerous ... :(

Stephen . .

Tom M Volunteer tester Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462
Message 1990496 - Posted: 18 Apr 2019, 13:17:44 UTC - in response to Message 1990457.

[quote]I don't think I will be installing this upgrade on my other box. And I will probably be downgrading this.

1) I am throwing a lot of "validation inconclusive"
2) And a lot of "computation errors".

I have switched to 1 cpu to 1 gpu on the off chance that #2 is being caused by cpu cores being overworked. However, I appear to still be getting "validation inconclusive".

This is running the CUDA 10.1 version with the latest 418 Nvidia drivers via Launchpad.[/quote]

It hasn't thrown either type of error since I switched to 1 to 1. (Shrug)

Tom
A proud member of the OFA (Old Farts Association).

W3Perl Volunteer tester Joined: 29 Apr 99 Posts: 251 Credit: 3,696,783,867 RAC: 12,606
Message 1990504 - Posted: 18 Apr 2019, 13:43:24 UTC

New petri binary is now running here :)

I've made some tests. Speed improvements (blc32 and arecibo wu) are:

10-15% for GTX 750 Ti
6-10% for GTX 1050 Ti
13-16% for GTX 1080
14-17% for GTX 1070

I use the following command line option: '-nobs -pfb 8' (16 for 1070 and 32 for 1080). Without the -pfb, the speed increase was less than 10%.

Previously, I had two different binaries: one for Pascal and the other for Maxwell. It may explain why I got only 10-15% improvements. Other users have posted values as high as 20-25%.

Average values:

750 Ti => blc: 344 sec, arecibo: 467 sec
1050 Ti => blc: 262 sec, arecibo: 344 sec
1070 => blc: 95 sec, arecibo: 244 sec
1080 => blc: 79 sec, arecibo: 108 sec

Thanks again for such a nice piece of software :)
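To put the average run times above in perspective, they can be turned into a rough tasks-per-day figure. The sketch below makes two assumptions of its own: each card runs one task at a time, and it crunches around the clock. Real throughput depends on the mix of work units.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Average run times in seconds, as reported in Message 1990504.
AVERAGE_SECONDS = {
    "GTX 750 Ti": {"blc": 344, "arecibo": 467},
    "GTX 1050 Ti": {"blc": 262, "arecibo": 344},
    "GTX 1070": {"blc": 95, "arecibo": 244},
    "GTX 1080": {"blc": 79, "arecibo": 108},
}


def tasks_per_day(seconds_per_task):
    """Tasks finished in 24 hours at one task at a time."""
    return SECONDS_PER_DAY / seconds_per_task


if __name__ == "__main__":
    for card, times in AVERAGE_SECONDS.items():
        rates = ", ".join(
            f"{kind}: {tasks_per_day(sec):.0f}/day" for kind, sec in times.items()
        )
        print(f"{card}: {rates}")
```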

Tom M Volunteer tester Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462
Message 1990505 - Posted: 18 Apr 2019, 13:49:45 UTC - in response to Message 1990504. Last modified: 18 Apr 2019, 13:52:34 UTC

[quote]New petri binary is now running here :)

I've made some tests. Speed improvements (blc32 and arecibo wu) are:

10-15% for GTX 750 Ti
6-10% for GTX 1050 Ti
13-16% for GTX 1080
14-17% for GTX 1070

I use the following command line option: '-nobs -pfb 8' (16 for 1070 and 32 for 1080). Without the -pfb, the speed increase was less than 10%.[/quote]

Thank you. I previously was not running -nobs on my test system because I was trying to maximize my cpu thread count. Since apparently I can't do that, I will see if the above command line will work on a gtx 1060 3GB.

Trying "-nobs -pfb 6"

Tom
A proud member of the OFA (Old Farts Association).

W3Perl Volunteer tester Joined: 29 Apr 99 Posts: 251 Credit: 3,696,783,867 RAC: 12,606
Message 1990510 - Posted: 18 Apr 2019, 14:02:57 UTC - in response to Message 1990505.

[quote]Thank you. I previously was not running -nobs on my test system because I was trying to maximize my cpu thread count. Since apparently I can't do that, I will see if the above command line will work on a gtx 1060 3GB.

Trying "-nobs -pfb 6"

Tom[/quote]

Rather try 8 or 16 (the 1060 has 1280 cores => 1280/128 = 10)
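The rule of thumb in the reply above (CUDA cores divided by 128, then pick a neighbouring power of two) can be written out as below. Treat it as a reading of that one post, not as documentation of -pfb: the thread does not say what the option does, the divisor 128 is simply the number used in the post, and the function name is invented. Note that the 1280-core figure belongs to the 6 GB GTX 1060; the 3 GB card mentioned earlier has 1152 cores, which gives 9 and therefore the same two candidates.

```python
def pfb_candidates(cuda_cores, cores_per_block=128):
    """Return (ratio, power of two at or below it, power of two at or above it)."""
    ratio = cuda_cores / cores_per_block
    lower = 1
    while lower * 2 <= ratio:
        lower *= 2
    upper = lower if lower == ratio else lower * 2
    return ratio, lower, upper


if __name__ == "__main__":
    for name, cores in [("GTX 1060 6GB", 1280), ("GTX 1060 3GB", 1152)]:
        ratio, lower, upper = pfb_candidates(cores)
        print(f"{name}: {cores}/128 = {ratio:g}, try -pfb {lower} or {upper}")
```

The values W3Perl reports using (8 for the 750 Ti and 1050 Ti, 16 for the 1070, 32 for the 1080) match the upper candidate for those cards' core counts, so the upper value is probably the one to try first.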

Ian&Steve C. Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640
Message 1990511 - Posted: 18 Apr 2019, 14:06:42 UTC - in response to Message 1990504. Last modified: 18 Apr 2019, 14:16:15 UTC

I saw almost 30% improvement on my 1080tis and RTX 2070s. But I didn't see as much improvement on the lesser cards. Closer to 10-15%.

Not sure why the faster cards got such a massive boost over the others, but who am I to complain lol.

Also, not sure what nvidia did with the 415+ drivers, but performance is definitely slower. This makes the cuda 10.0 faster than cuda 10.1 because you can stay with the 410 drivers. But I think those with RTX 2060, GTX 1660/ti, are stuck with needing 415+ drivers for support.

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

W3Perl Volunteer tester Joined: 29 Apr 99 Posts: 251 Credit: 3,696,783,867 RAC: 12,606
Message 1990515 - Posted: 18 Apr 2019, 14:33:28 UTC - in response to Message 1990511. Last modified: 18 Apr 2019, 14:34:59 UTC

[quote]I saw almost 30% improvement on my 1080tis and RTX 2070s. But I didn't see as much improvement on the lesser cards. Closer to 10-15%.[/quote]

Another argument to buy a GTX 20X0 ;)

[quote]Not sure why the faster cards got such a massive boost over the others, but who am I to complain lol.[/quote]

Newest cards should have better code optimization? Petri has only high-end cards, so it's easier to improve the code when you can test it on your own card ;)
