Setting up Linux to crunch CUDA90 and above for Windows users

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2028743 - Posted: 21 Jan 2020, 12:53:57 UTC - in response to Message 2028740.  

Like Richard said, 0.9 doesn’t mean it’s using that much. That value is only used for BOINC’s internal bookkeeping, so it knows how many resources are in use and how many jobs to run.

With gravity wave, I actually observed the GPU tasks using more than a full thread. About 1.2 - 1.5 CPU threads per GPU WU.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2028743
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2028744 - Posted: 21 Jan 2020, 13:04:38 UTC - in response to Message 2028743.  

It's relatively unimportant for SETI-only crunchers, and for users of BOINC v7.14.2.

But be aware that there's a change coming in BOINC v7.16.n: all those 0.9 CPUs for GPU tasks will be counted up and used to calculate how much cache time you have left for CPU tasks. CPU caching for other projects may be adversely affected if you have a lot of GPU tasks lined up. Work fetch for CPU can be restored to normal by setting a lower CPU value via app_config.xml.
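
As a rough sketch of the app_config.xml approach (not taken from any post in this thread): the function below writes a minimal file that lowers the CPU fraction BOINC budgets for each GPU task. The app name setiathome_v8, the 0.5 value, the function name and the project directory path are all assumptions; check the app names your own client reports before using them.

```shell
#!/bin/sh
# Sketch only: write a minimal app_config.xml that lowers the CPU share
# BOINC budgets for each GPU task. App name and values are assumptions.

write_app_config() {
    project_dir=$1          # e.g. the setiathome.berkeley.edu project folder
    app_name=$2             # assumed app name, e.g. setiathome_v8
    cpu_usage=$3            # CPU fraction to budget per GPU task, e.g. 0.5

    cat > "$project_dir/app_config.xml" <<EOF
<app_config>
  <app>
    <name>$app_name</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>$cpu_usage</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
EOF
}

# Example (path is hypothetical; adjust to your BOINC data directory):
# write_app_config "$HOME/BOINC/projects/setiathome.berkeley.edu" setiathome_v8 0.5
```

The client should pick the file up after a restart or after re-reading the config files from the Manager. As noted earlier in the thread, the value only changes BOINC's bookkeeping, not how much CPU the task really uses.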
ID: 2028744
JWJennison
Joined: 8 Mar 19
Posts: 14
Credit: 54,481,483
RAC: 150
United States
Message 2028863 - Posted: 23 Jan 2020, 23:36:02 UTC

Looking for help getting Ubuntu to control a USB AIO cooler.
I have OpenCorsairLink seeing the device, but it won't read from or write to it.
Any help would be appreciated.
root@james-desktop:/home/james/hidapi/OpenCorsairLink# sudo ./OpenCorsairLink.elf --device=0 --fan mode=5 --pump mode=5
Dev=0, CorsairLink Device Found: H115i Pro!

Vendor: Corsair
Product: H115i Pro
Firmware: 0.0.0.0
Temperature 0: 0.00 C
Fan 0: Mode 0x00
Current/Max Speed 0/0 RPM
Fan 1: Mode 0x00
Current/Max Speed 0/0 RPM
Pump: Mode 0x00 (AsetekProPumpQuiet)
Current/Max Speed 0/0 RPM
Setting fan to mode: 5
Setting pump to mode: 5

ID: 2028863
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2028881 - Posted: 24 Jan 2020, 0:52:44 UTC - in response to Message 2028863.  

Use the jonasmalacofilho liquidctl application instead. Much easier to install and get running. I had no luck with the OpenCorsairLink program. Probably only works on Corsair hardware. The liquidctl application works on most of the generic Asetek AIO hardware coolers.
https://github.com/jonasmalacofilho/liquidctl
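
A minimal sketch of a first liquidctl session, assuming the tool is already installed and the cooler is one it supports. The wrapper function name is made up for illustration; the three liquidctl subcommands are its basic list, initialize and status commands.

```shell
#!/bin/sh
# Sketch only: list, initialize and query coolers through liquidctl.
# USB access usually needs root or suitable udev rules.

cooler_report() {
    if ! command -v liquidctl >/dev/null 2>&1; then
        echo "liquidctl not found; install it first" >&2
        return 1
    fi
    liquidctl list            # show the devices liquidctl can see
    liquidctl initialize all  # most devices need this once after power-on
    liquidctl status          # temperatures, fan and pump speeds
}

# Example: cooler_report    (run as root if the device cannot be opened)
```

If the status output shows real temperatures and speeds instead of zeros, fan and pump settings can then be tried with the set commands described in the project's README.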
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2028881
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2028882 - Posted: 24 Jan 2020, 1:06:16 UTC - in response to Message 2028740.  


On the subject of pictures - it would be helpful if you could use an image hosting service which would allow you to show screenshots at a higher resolution than 300x240 - I found those hard to read.


I am not sure what the resolution is on the original screenshots. I did manage to get some shots of E@H during the maintenance on Tuesday. Does anyone recommend a website for hosting images? I know it is an issue with Google Drive; you have to tickle it just right. And my other choice of host is my free blog site from WordPress.

Thank you Ian.

I hope I didn't screw it up again.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2028882
Stephen "Heretic" Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2028911 - Posted: 24 Jan 2020, 2:35:50 UTC - in response to Message 2028882.  

On the subject of pictures - it would be helpful if you could use an image hosting service which would allow you to show screenshots at a higher resolution than 300x240 - I found those hard to read.

I am not sure what the resolution is on the original screenshots. I did manage to get some shots of E@H during the maintenance on Tuesday. Does anyone recommend a website for hosting images? I know it is an issue with Google Drive; you have to tickle it just right. And my other choice of host is my free blog site from WordPress.
Thank you Ian.
I hope I didn't screw it up again.
Tom


. . You could try Dropbox if you like.

Stephen

. .
ID: 2028911
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2028912 - Posted: 24 Jan 2020, 2:37:39 UTC - in response to Message 2028882.  
Last modified: 24 Jan 2020, 2:38:40 UTC

Yeah it’s because each CPU task is using a massive amount of memory. 750MB for each instance.

So you will have to limit the jobs as you’ve done and use the CPU less, or if you want to use more threads, you need to add more memory to the system.

For image hosting I like imgur.com best.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2028912
Darrell Wilcox Project Donor
Volunteer tester
Joined: 11 Nov 99
Posts: 303
Credit: 180,954,940
RAC: 118
Vietnam
Message 2028922 - Posted: 24 Jan 2020, 4:43:03 UTC - in response to Message 2022589.  

@ Jimbocous

An update to your instructions:

I just wasted this morning trying to re-install Ubuntu and BOINC. Finally, I got it to work, but ...

If you update the Ubuntu system after installing and before following these instructions, your
instructions fail. If you DON'T update after installing, your instructions work beautifully.

It appears to this neophyte that some security update has broken something needed, since
I ONLY updated with security fixes. Perhaps a Linux guru can tell me what it is/was.

So if you are pounding your head against the wall trying to make this work, stop, re-install
Ubuntu but DO NOT UPDATE YET, and follow these instructions.
ID: 2028922
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2028932 - Posted: 24 Jan 2020, 6:35:42 UTC - in response to Message 2028922.  
Last modified: 24 Jan 2020, 6:42:22 UTC

@ Jimbocous

An update to your instructions:

I just wasted this morning trying to re-install Ubuntu and BOINC. Finally, I got it to work, but ...

If you update the Ubuntu system after installing and before following these instructions, your
instructions fail. If you DON'T update after installing, your instructions work beautifully.

It appears to this neophyte that some security update has broken something needed, since
I ONLY updated with security fixes. Perhaps a Linux guru can tell me what it is/was.

So if you are pounding your head against the wall trying to make this work, stop, re-install
Ubuntu but DO NOT UPDATE YET, and follow these instructions.

I'm a tad puzzled, in that I really wasn't trying to do a full install instruction here, if I'm understanding which message you're replying to, but rather excerpting from TBar's full instructions out of his AIO package just the parts for adding OpenCL.
Or am I missing something here? Can't tell without some quoted context.
[edit] Ah, I see, that was from a month ago. Sorry, recycled those brain cells a while back. But that's what I get for just excerpting. Ian&Steve covered it more thoroughly in the next message.
ID: 2028932
Darrell Wilcox Project Donor
Volunteer tester
Joined: 11 Nov 99
Posts: 303
Credit: 180,954,940
RAC: 118
Vietnam
Message 2028943 - Posted: 24 Jan 2020, 7:47:18 UTC - in response to Message 2028932.  

@ Jimbocous AND Ian&Steve

I think your instructions AND Ian's are equivalent, and they WON'T work if a new install
of Ubuntu has had the system security updates applied before the instructions are followed.

Your instructions:
==========================================================================================
10) Add this to your system so you will be able to easily install the video drivers:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

11) Install NVIDIA and OpenCL drivers:
sudo apt install nvidia-driver-430    # needs the repository from step 10 installed, or it fails
sudo apt install ocl-icd-libopencl1   # changed from "sudo apt install libopencl1", which fails
then reboot.
12) Use the "Software Updater" to install "Additional Drivers" and upgrade Linux/Ubuntu to the latest driver (probably).
==========================================================================================

Ian's

==========================================================================================
sudo apt remove boinc-client
sudo apt remove boinc-manager

first add the Ubuntu drivers PPA.
sudo add-apt-repository ppa:graphics-drivers/ppa

update repositories:
sudo apt update

install the driver:
sudo apt install nvidia-driver-440

install the openCL driver for AstroPulse tasks:
sudo apt install ocl-icd-libopencl1

Then reboot. you should be done and ready to go.
==========================================================================================

Your instructions worked about December 10 or 11, but now they don't -- UNLESS I don't apply system security updates
before using them. My deduction is that some system security update BROKE something here.

However, I had installed the BOINC Manager/client from Berkeley before December 10, but not now. I did NOT remove them
and things worked as far as BOINC was concerned. I did not get WSDD or Network Neighborhood working, which is why
I did a clean install, etc., trying to "fix" whatever I had broken in that function.

I don't know enough to further diagnose this, only enough to report it for someone more knowledgeable to go after it. If
someone wants to chase it and wants my help, PM me to avoid confusion here, and we will report any fix found.
ID: 2028943
Stephen "Heretic" Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2028964 - Posted: 24 Jan 2020, 12:07:27 UTC - in response to Message 2028932.  

I'm a tad puzzled, in that I really wasn't trying to do a full install instruction here, if I'm understanding which message you're replying to, but rather excerpting from TBar's full instructions out of his AIO package just the parts for adding OpenCL.


. . A good place to say thanks for that help BTW. Fixed the OpenCL problem and Einstein then worked fine.

Stephen

. .
ID: 2028964
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2028977 - Posted: 24 Jan 2020, 14:16:10 UTC - in response to Message 2028943.  

HOW does it not work?

At which step does it fail, and what error does it spit out?

We need some more details if we’re to help.
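
One way to collect those details, as a sketch: run each install step through a small wrapper that saves the full output and the exit status to a log file, then post the part of the log around the first non-zero status. The function name and log file name are made up for illustration.

```shell
#!/bin/sh
# Sketch only: run one install step, keep its full output in a log file,
# and record the exit status so the failing step can be quoted exactly.

run_logged() {
    log=$1; shift
    echo "### $*" >> "$log"             # which command was run
    "$@" >> "$log" 2>&1                 # stdout and stderr both go to the log
    status=$?
    echo "### exit status: $status" >> "$log"
    return $status
}

# Example (log file name is hypothetical; -y is used because the
# confirmation prompt would otherwise be hidden inside the log):
# run_logged install.log sudo add-apt-repository -y ppa:graphics-drivers/ppa
# run_logged install.log sudo apt-get update
# run_logged install.log sudo apt-get install -y nvidia-driver-440
# run_logged install.log sudo apt-get install -y ocl-icd-libopencl1
```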
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2028977
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2028993 - Posted: 24 Jan 2020, 15:57:05 UTC

What distro and version of Linux are you trying to install? If you are installing the latest Ubuntu 19.10 version, installing drivers manually or adding the ppa is unnecessary because the Nvidia drivers are included already. Only the OpenCL parts might be missing.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2028993
Darrell Wilcox Project Donor
Volunteer tester
Joined: 11 Nov 99
Posts: 303
Credit: 180,954,940
RAC: 118
Vietnam
Message 2029347 - Posted: 26 Jan 2020, 10:53:35 UTC - in response to Message 2028977.  

@ Ian&Steve C.
It doesn't work if I apply security updates immediately after installing Ubuntu 18.04.3. This is
how I have installed new O/Ses for years, but it is wrong here.

Installing Ubuntu and NOT UPDATING before doing the steps AS WRITTEN works.

I no longer need help with this as I have it working now.

Thanks Ian.
ID: 2029347
Darrell Wilcox Project Donor
Volunteer tester
Joined: 11 Nov 99
Posts: 303
Credit: 180,954,940
RAC: 118
Vietnam
Message 2029349 - Posted: 26 Jan 2020, 10:55:44 UTC - in response to Message 2028993.  

@ Keith Myers
What distro and version of Linux are you trying to install?

It is Ubuntu 18.04.3, and it does work IF TBar's instructions are followed TO THE LETTER.

Thanks.
ID: 2029349
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2029364 - Posted: 26 Jan 2020, 13:30:19 UTC

Again. Please be more specific on exactly HOW it doesn’t work if you do updates. What exactly happens?

I’ve updated my systems and it still works fine.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2029364
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2030300 - Posted: 1 Feb 2020, 14:05:36 UTC

I was looking at my task manager on my 3950x Linux box.

And while it is not running S@H tasks right now, it is showing about two-thirds of the World Community Grid CPU tasks I am running at a 3% CPU load and one-third at 2%.

I am not certain I am seeing any difference in processing time, but it has gotten me wondering. I run 90% of available CPUs on BOINC, and the overall workload reported is at around 88%-91%, just like it "normally" is.

I have a raft of E@H cpu tasks parked waiting since I turned WCG back on.

Any discussion?

Tom
A proud member of the OFA (Old Farts Association).
ID: 2030300
Buckeye4LF Project Donor
Joined: 19 Jun 00
Posts: 173
Credit: 54,916,209
RAC: 833
United States
Message 2030878 - Posted: 5 Feb 2020, 14:42:18 UTC

I have run on Windows for almost 20 years and I am finally setting up a dedicated Linux box. I have had a lot of help from Keith Myers from the GPU Users Group to get things running. I will be starting with CUDA90 but will hopefully step up to CUDA102 quickly. It is a new rig; I went all out on the processor (Threadripper 3970X) and added dual RTX 2070 SUPER cards. My RAC has been pretty sad as of late.

ID: 2030878
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2030887 - Posted: 5 Feb 2020, 16:48:18 UTC - in response to Message 2030878.  

I saw your post over on the Lunatics installer thread. Make sure to use Richard's latest installer for the Windows side of the rig. If you use the latest Nvidia 442.19 drivers, you can avoid the issue of the stalled out VHAR tasks from the Arecibo antenna. The SoG app is fastest and safe to use with the latest drivers.

For the Linux side, TBar's AIO is the easiest and best to use. Start with the default CUDA90 app. Update your drivers to the latest 440 series using the graphics-drivers ppa repository and after rebooting, edit the app_info with Find and Replace in the Text Editor to switch over to the CUDA102 application.
https://launchpad.net/~graphics-drivers/+archive/ubuntu/ppa
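
The Find and Replace step can also be done from a terminal. This is only a sketch: the strings cuda90 and cuda102, the function name and the file path are assumptions, so check what your own app_info.xml actually contains, and make sure the CUDA102 application file is present in the project folder, before relying on it.

```shell
#!/bin/sh
# Sketch only: the "Find and Replace" step done with sed.
# The cuda90 / cuda102 strings are assumptions about the file's contents.

switch_cuda_tag() {
    file=$1; from=$2; to=$3
    cp "$file" "$file.bak" || return 1          # keep a backup of the original
    sed "s/$from/$to/g" "$file.bak" > "$file"   # replace every occurrence
    grep -c "$to" "$file"                       # count lines that now mention the new tag
}

# Example (stop BOINC first; path is hypothetical):
# switch_cuda_tag "$HOME/BOINC/projects/setiathome.berkeley.edu/app_info.xml" cuda90 cuda102
```

The backup copy ending in .bak makes it easy to go back to the CUDA90 setup if the new application does not start.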
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2030887
Buckeye4LF Project Donor
Joined: 19 Jun 00
Posts: 173
Credit: 54,916,209
RAC: 833
United States
Message 2031021 - Posted: 6 Feb 2020, 11:32:39 UTC - in response to Message 2030887.  

Linux is giving me issues. Does anyone know what I am doing wrong? Here is a paste from the terminal window:

greg@greg-MS-7C59:~$ boinc
06-Feb-2020 05:29:51 [---] cc_config.xml not found - using defaults
06-Feb-2020 05:29:51 [---] Starting BOINC client version 7.16.3 for x86_64-pc-linux-gnu
06-Feb-2020 05:29:51 [---] log flags: file_xfer, sched_ops, task
06-Feb-2020 05:29:51 [---] Libraries: libcurl/7.65.3 OpenSSL/1.1.1c zlib/1.2.11 libidn2/2.2.0 libpsl/0.20.2 (+libidn2/2.0.5) libssh/0.9.0/openssl/zlib nghttp2/1.39.2 librtmp/2.3
06-Feb-2020 05:29:51 [---] Data directory: /home/greg
06-Feb-2020 05:29:52 [---] CUDA: NVIDIA GPU 0: GeForce RTX 2070 SUPER (driver version 440.48, CUDA version 10.2, compute capability 7.5, 4096MB, 3968MB available, 9216 GFLOPS peak)
06-Feb-2020 05:29:52 [---] CUDA: NVIDIA GPU 1: GeForce RTX 2070 SUPER (driver version 440.48, CUDA version 10.2, compute capability 7.5, 4096MB, 3968MB available, 9216 GFLOPS peak)
06-Feb-2020 05:29:52 [---] Creating new client state file
06-Feb-2020 05:29:52 [---] [libc detection] gathered: 2.30, Ubuntu GLIBC 2.30-0ubuntu2
06-Feb-2020 05:29:52 [---] Host name: greg-MS-7C59
06-Feb-2020 05:29:52 [---] Processor: 64 AuthenticAMD AMD Ryzen Threadripper 3970X 32-Core Processor [Family 23 Model 49 Stepping 0]
06-Feb-2020 05:29:52 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
06-Feb-2020 05:29:52 [---] OS: Linux Ubuntu: Ubuntu 19.10 [5.0.0-38-generic|libc 2.30 (Ubuntu GLIBC 2.30-0ubuntu2)]
06-Feb-2020 05:29:52 [---] Memory: 125.85 GB physical, 2.00 GB virtual
06-Feb-2020 05:29:52 [---] Disk: 915.40 GB total, 857.67 GB free
06-Feb-2020 05:29:52 [---] Local time is UTC -6 hours
06-Feb-2020 05:29:52 [---] Last benchmark was 18298 days 11:29:51 ago
06-Feb-2020 05:29:52 [---] No general preferences found - using defaults
06-Feb-2020 05:29:52 [---] Preferences:
06-Feb-2020 05:29:52 [---] max memory usage when active: 64433.68 MB
06-Feb-2020 05:29:52 [---] max memory usage when idle: 115980.63 MB
06-Feb-2020 05:29:52 [---] max disk usage: 823.86 GB
06-Feb-2020 05:29:52 [---] don't use GPU while active
06-Feb-2020 05:29:52 [---] suspend work if non-BOINC CPU load exceeds 25%
06-Feb-2020 05:29:52 [---] (to change preferences, visit a project web site or select Preferences in the Manager)
06-Feb-2020 05:29:52 [---] Setting up project and slot directories
dir_open: Could not open directory 'slots' from '/home/greg'.
06-Feb-2020 05:29:52 [---] Checking active tasks
06-Feb-2020 05:29:52 [---] Setting up GUI RPC socket
I 06-Feb-2020 05:30:21 [---] GUI RPC bind to port 31416 failed: 98
06-Feb-2020 05:30:22 gstate.init() failed
Error Code: -180
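
A possible reading of the log above, offered as a sketch and not a diagnosis: on Linux, errno 98 is EADDRINUSE, so the bind failure usually means something already holds port 31416, often a BOINC client that is already running. The helper below only decodes that log line; its name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: decode "GUI RPC bind to port N failed: E" lines from a BOINC log.
# The errno names are the Linux values; anything else is printed as a raw number.

explain_bind_error() {
    sed -n 's/.*GUI RPC bind to port \([0-9][0-9]*\) failed: \([0-9][0-9]*\).*/\1 \2/p' |
    while read -r port err; do
        case $err in
            98) echo "port $port: address already in use (EADDRINUSE)" ;;
            13) echo "port $port: permission denied (EACCES)" ;;
            *)  echo "port $port: errno $err" ;;
        esac
    done
}

# Example:
# boinc 2>&1 | explain_bind_error
# pgrep -a boinc        # is another client already running?
```

The "Data directory: /home/greg" line also suggests the client was started from the home directory, so it created a fresh state there instead of using an existing BOINC folder.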

ID: 2031021
 