Setting up Linux to crunch CUDA90 and above for Windows users

Message boards : Number crunching : Setting up Linux to crunch CUDA90 and above for Windows users


Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032694 - Posted: 16 Feb 2020, 17:14:24 UTC - in response to Message 2032693.  

I edited my post. I never had an api_version entry in my app_info either.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032694
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2032698 - Posted: 16 Feb 2020, 17:31:41 UTC - in response to Message 2032694.  

I edited my post. I never had an api_version entry in my app_info either.
And I've edited mine. It probably doesn't matter for personal use, if you always keep the build API pretty close to the version of the BOINC client you run - but worth knowing about if you build for wider distribution.
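
For anyone wondering where that tag actually lives: api_version is an optional element of each <app_version> block in app_info.xml, recording which BOINC API the app was built against. A minimal sketch only; the app and file names here are placeholders, not the actual special-app packaging:

<app_info>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>setiathome_x41p_cuda_example</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>800</version_num>
    <api_version>7.5.0</api_version>
    <plan_class>cuda90</plan_class>
    <file_ref>
      <file_name>setiathome_x41p_cuda_example</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>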
ID: 2032698
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032699 - Posted: 16 Feb 2020, 18:19:27 UTC - in response to Message 2032698.  
Last modified: 16 Feb 2020, 19:19:02 UTC

If you can convince me Jason was wrong for wanting to stay around 6.x, and Petri is wrong for continuing to use 7.5, I might change it. I did try 7.11 for a while, but then decided it would be best to use what Petri is using.
If you check, Jimbocous is running 7 GPUs on an older system that probably wasn't designed to run that many GPUs. On systems not designed for it, you hit a wall around 7-8 GPUs, especially if you are using a PCIe switch. Ask Tom M about that. If you want to run that many GPUs you need a board designed for it: either an expensive server board or a relatively cheap mining board. For the price, the mining board works for me.
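
If anyone wants to check how many GPUs a board is actually enumerating and how they are wired up, a couple of stock commands will show it (a rough sketch; the exact output depends on your driver version):

lspci | grep -i nvidia                                      # every NVIDIA device visible on the PCIe bus
nvidia-smi --query-gpu=index,name,pci.bus_id --format=csv   # what the driver itself sees
nvidia-smi topo -m                                          # link matrix, shows which GPUs sit behind a switch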

BTW, ASUS is still updating their mining board; look at what the updates have mentioned:
2019/10/21 5.64 MBytes
B250 MINING EXPERT BIOS 1208
Improved system compatibility

2019/03/15 5.64 MBytes
B250 MINING EXPERT BIOS 1207
Improve system security and stability

2018/07/20 5.64 MBytes
B250 MINING EXPERT BIOS 1206
1.Improve system compatibility and stability

2018/06/01 5.64 MBytes
B250 MINING EXPERT BIOS 1205
Intel New ME Update, Improve Stability.

2018/04/12 5.68 MBytes
B250 MINING EXPERT BIOS 1010
1. Improve memory compatibility.

2018/03/23 5.68 MBytes
B250 MINING EXPERT BIOS 1006
1. Update CPU Microcode
2. Improve system security and stability

It keeps going; this was the second one they released:
2017/10/09 5.67 MBytes
B250 MINING EXPERT BIOS 0401
Improve system stability
ID: 2032699
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032700 - Posted: 16 Feb 2020, 18:44:56 UTC - in response to Message 2032699.  
Last modified: 16 Feb 2020, 18:49:09 UTC

Some people just have an "if it ain't broke, don't fix it" mentality. It works fine for a while, until the world moves on around you and your outdated platform starts to break. Notice all the 6.x guys recently, suddenly asking why they can't communicate with the project due to outdated security certificates, needing to jump through hoops just to keep things going, some not even able to and being forced to upgrade. You yourself have even commented on Petri's idiosyncrasies. Nothing wrong with that, and Petri is more than capable of handling any issues that might come up. I fall into this sometimes: since my systems are so stable, I usually don't do any unnecessary updates, which keeps me from having a lot of the issues like systems breaking from a driver update or something similar. But I do it within reason, and update after the seas are calm. Same reason I've avoided the Ubuntu short-term releases, but I will move to 20.04 LTS when it's proven stable.
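
For anyone stuck on that certificate problem with an old client, the usual workaround is to drop a current CA bundle over the stale ca-bundle.crt that ships with BOINC. Roughly like this, assuming a stock Ubuntu repository install with the data directory at /var/lib/boinc-client (adjust the path for your own setup):

sudo wget https://curl.se/ca/cacert.pem -O /var/lib/boinc-client/ca-bundle.crt
sudo systemctl restart boinc-client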

My 7x 2070 system on the Z270 platform is very similar to your 14-GPU B250 mining system in the way they work and the components used. Total system cost is likely within a few hundred dollars. Doing more work with half as many cards, less power draw, and fewer headaches: I'll take that any day.

I may update that system soon, since I just replaced my Z370-based test bench with my backup board for the watercooled system. I could take the Z370 board, drop in a relatively cheap 12-thread CPU, and add another GPU or two to the system. Still undecided.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032700
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2032701 - Posted: 16 Feb 2020, 18:50:59 UTC - in response to Message 2032699.  

I think Jason was hoping to persuade BOINC to leave runtime estimation as a client function, through submitting code (which I tested for a while in a working client) to maintain separate DCF values for each application version. I would agree with him, because the current runtime estimation server-side code suffers badly from boundary-condition errors (new installations and new application versions) and from slow response times.

But that was 10 years ago, and the world has moved on - we are stuck, for better or worse, with CreditNew and the associated runtime estimation model.

I would suggest that it's better to stay in touch with the newer BOINC developments: after all, the SETI (science) applications have no productive purpose outside the BOINC framework, and in my view it's better to work with BOINC than to keep re-running ten-year-old battles against it.
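
For readers who weren't around for that argument: DCF, the duration correction factor, was a single per-project multiplier kept by the client, and the estimate was roughly

    estimated runtime ≈ (rsc_fpops_est / projected flops of the app version) × DCF

with DCF nudged after every completed task so the estimates tracked reality. Jason's patch essentially kept one such factor per application version instead of one per project; CreditNew moved the whole estimate server-side instead.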
ID: 2032701
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032703 - Posted: 16 Feb 2020, 19:10:30 UTC - in response to Message 2032701.  

I think Jason was hoping to persuade BOINC to leave runtime estimation as a client function, through submitting code (which I tested for a while in a working client) to maintain separate DCF values for each application version. I would agree with him, because the current runtime estimation server-side code suffers badly from boundary-condition errors (new installations and new application versions) and from slow response times.

But that was 10 years ago, and the world has moved on - we are stuck, for better or worse, with CreditNew and the associated runtime estimation model.

I would suggest that it's better to stay in touch with the newer BOINC developments: after all, the SETI (science) applications have no productive purpose outside the BOINC framework, and in my view it's better to work with BOINC than to keep re-running ten-year-old battles against it.

100% agree
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032703
Phud Redux

Joined: 20 Apr 16
Posts: 270
Credit: 2,976,272
RAC: 1
United States
Message 2032711 - Posted: 16 Feb 2020, 20:46:39 UTC

Could one of you fine gentlemen please post the links to the Linux apps?

Please and many thanks!
ID: 2032711
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2032713 - Posted: 16 Feb 2020, 21:04:18 UTC

. . HEY GUYS!!!

. . This discussion has turned this thread into a developers' forum, which was never its intention. Some poor Linux newbie who wanders into this will be "WTF!" It was meant to provide user-level support so people could migrate to a more productive platform ... So please remember the KISS principle ...

Stephen

< shrug >
ID: 2032713
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032714 - Posted: 16 Feb 2020, 21:06:40 UTC - in response to Message 2032711.  
Last modified: 16 Feb 2020, 21:09:08 UTC

Could one of you fine gentlemen please post the links to the Linux apps?

Please and many thanks!


First, I would recommend that you upgrade to the 440.xx drivers so that you can use the CUDA 10.2 builds. It'll be a benefit for your Pascal- and Turing-based cards.

After that, you can get TBar's V0.98 builds here: http://www.arkayn.us/lunatics/BOINC.7z
This is a complete "All-in-One" BOINC package; you do not need the repository install, and you should remove it before using this.

If you want to try a newer V0.99 mutex-enabled build (it pre-loads a second WU, eliminating downtime between WUs), you can use my build here: https://setiathome.berkeley.edu/forum_thread.php?id=84933 - but I would recommend waiting a few hours. I just made several new builds that should be faster and give people more flexibility on supported hardware, and I'm running them through my test suite now.

I would download TBar's package first and get up and running with it. Then, after I gather my test results, you can decide if you want to modify it with a different app.
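
A rough outline of those steps on Ubuntu, for anyone new to this (the package names are the usual ones from the graphics-drivers PPA; check what suits your own cards and setup before running anything):

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-440            # 440.xx for the CUDA 10.2 builds; reboot afterwards
sudo apt remove boinc-client boinc-manager    # drop the repository BOINC before using the All-in-One
wget http://www.arkayn.us/lunatics/BOINC.7z
7z x BOINC.7z -o"$HOME"                       # needs p7zip-full; unpacks the All-in-One package
cd ~/BOINC && ./boinc                         # exact folder name and layout depend on the archive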
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032714
Phud Redux

Joined: 20 Apr 16
Posts: 270
Credit: 2,976,272
RAC: 1
United States
Message 2032715 - Posted: 16 Feb 2020, 21:18:34 UTC - in response to Message 2032714.  

Thanks.
ID: 2032715
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032751 - Posted: 17 Feb 2020, 2:40:55 UTC

Ha, found another one, https://setiathome.berkeley.edu/workunit.php?wuid=3886727662

   Device 1: GeForce GTX 750 Ti is okay
SETI@home using CUDA accelerated device GeForce GTX 750 Ti
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1

setiathome v8 enhanced x41p_V0.98b1, Cuda 9.00 special
Modifications done by petri33, compiled by TBar

Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.011773
Sigma 62
Sigma > GaussTOffsetStop: 62 > 2
Thread call stack limit is: 1k
Pulse: peak=10.53272, time=45.86, period=30.11, d_freq=7696464951.88, score=1.057, chirp=-2.8368, fft_len=1024 
setiathome_CUDA: Found 1 CUDA device(s):
  Device 1: GeForce GTX 750 Ti, 1999 MiB, regsPerBlock 65536
     computeCap 5.0, multiProcs 5 
     pciBusID = 1, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 750 Ti is okay
SETI@home using CUDA accelerated device GeForce GTX 750 Ti
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1

setiathome v8 enhanced x41p_V0.98b1, Cuda 9.00 special
Modifications done by petri33, compiled by TBar

Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.011773
Sigma 62
Sigma > GaussTOffsetStop: 62 > 2
Thread call stack limit is: 1k
Triplet: peak=11.00201, time=29.44, period=28.58, d_freq=7696462309.32, chirp=14.391, fft_len=256 
Spike: peak=24.16754, time=42.95, d_freq=7696464873.91, chirp=18.156, fft_len=64k
Triplet: peak=11.72892, time=59.15, period=6.845, d_freq=7696464753.79, chirp=70.223, fft_len=1024 
Triplet: peak=10.59451, time=55.71, period=16.65, d_freq=7696462779.31, chirp=-96.307, fft_len=128 

Best spike: peak=24.16754, time=42.95, d_freq=7696464873.91, chirp=18.156, fft_len=64k
Best autocorr: peak=17.2569, time=74.45, delay=3.2613, d_freq=7696461019.14, chirp=3.9728, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 
Best triplet: peak=11.72892, time=59.15, period=6.845, d_freq=7696464753.79, chirp=70.223, fft_len=1024 

Spike count:    1
Autocorr count: 0
Pulse count:    0
Triplet count:  3
Gaussian count: 0
The correct result:
Best pulse: peak=2.095706, time=45.86, period=3.081, d_freq=7696462738.7, score=1.068, chirp=-50.368, fft_len=1024 
Spike count:    1
Autocorr count: 0
Pulse count:    19
Triplet count:  3
Gaussian count: 0
It found a pulse before the reboot, then missed them all afterwards. Same as on a Mac.
ID: 2032751
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032754 - Posted: 17 Feb 2020, 2:50:22 UTC - in response to Message 2032751.  

Let me know when you find one using an app that I compiled. It would be interesting to see if it happens on mine too. So far they've all been on yours, if I'm not mistaken. But then again, there are a lot more people running yours, so I understand the statistics surrounding that.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032754
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2032763 - Posted: 17 Feb 2020, 4:44:10 UTC - in response to Message 2032699.  

If you check, Jimbocous is running 7 GPUs on an older system that probably wasn't designed to run that many GPUs. On systems not designed for it, you hit a wall around 7-8 GPUs, especially if you are using a PCIe switch.
The system I am discussing has 5 GPUs. Paradoxically, the one with 7 GPUs doesn't seem to have these issues. And the wall on those HP mobos is definitely 7 GPUs.
ID: 2032763
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2032764 - Posted: 17 Feb 2020, 4:48:02 UTC - in response to Message 2032508.  
Last modified: 17 Feb 2020, 4:49:40 UTC

That's a nasty one, but - I think - different.

The first task wrote
<message>Process still present 5 min after writing finish file; aborting</message>
and then went on to write a normal std_err, right down to 'called boinc_finish(0)'
...

As the comment says, "it must be hung somewhere in boinc_finish()" - very late in boinc_finish, if it wrote the file over five minutes ago. But it would be a normal part of BOINC's exit function to copy std_err.txt from the slot folder into client_state.xml, so that it's reported...
The point being that this seems to be where the driver gets crashed, not just the client.

@Juan, had another one this morning, but sorry, wasn't awake enough to get the slots info you asked for ... I'll keep trying.
ID: 2032764
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032768 - Posted: 17 Feb 2020, 5:31:04 UTC - in response to Message 2032763.  

If you check, Jimbocous is running 7 GPUs on an older system that probably wasn't designed to run that many GPUs. On systems not designed for it, you hit a wall around 7-8 GPUs, especially if you are using a PCIe switch.
The system I am discussing has 5 GPUs. Paradoxically, the one with 7 GPUs doesn't seem to have these issues. And the wall on those HP mobos is definitely 7 GPUs.
Hmmm, OK, it does seem to be an older one now that I look at it. But it does say 7, with the same error:
https://setiathome.berkeley.edu/result.php?resultid=8485188776
<message>
Process still present 5 min after writing finish file; aborting</message>
<stderr_txt>
setiathome_CUDA: Found 7 CUDA device(s):
Device 1: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 5, pciSlotID = 0
Device 2: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 6, pciSlotID = 0
Device 3: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 7, pciSlotID = 0
Device 4: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 8, pciSlotID = 0
Device 5: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 15, pciSlotID = 0
Device 6: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 28, pciSlotID = 0
Device 7: GeForce GTX 980, 4040 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 40, pciSlotID = 0
ID: 2032768
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2032770 - Posted: 17 Feb 2020, 5:39:30 UTC - in response to Message 2032768.  

Hmmm, OK, it does seem to be an older one now that I look at it. But it does say 7, with the same error
Would seem to argue against any mobo GPU count hard limit as being the issue ...
ID: 2032770
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032772 - Posted: 17 Feb 2020, 5:49:03 UTC - in response to Message 2032770.  

On the other one with 5 GPUs, it seems you are getting Overflows while your Wingmen aren't, https://setiathome.berkeley.edu/results.php?hostid=8859436&state=6
ID: 2032772
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2032773 - Posted: 17 Feb 2020, 5:57:06 UTC - in response to Message 2032772.  
Last modified: 17 Feb 2020, 5:59:32 UTC

On the other one with 5 GPUs, it seems you are getting Overflows while your Wingmen aren't, https://setiathome.berkeley.edu/results.php?hostid=8859436&state=6

If you've been following my discussion of this, you'll note that the overflows are all subsequent to the initial failure, as a result of the NV driver no longer knowing the card exists.
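
That is easy enough to confirm when it happens: when the driver loses a card you normally get NVIDIA Xid entries in the kernel log, and the device drops out of nvidia-smi. For example:

dmesg | grep -i xid     # Xid errors logged when a GPU faults or falls off the bus
nvidia-smi -L           # list the GPUs the driver can still see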
ID: 2032773
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032775 - Posted: 17 Feb 2020, 6:14:57 UTC - in response to Message 2032773.  

Have you tried it with a different driver? I see 390 is in the repository, and that will work with CUDA 9.0. I remember running 396 for quite a while, if that driver is possible for you.
ID: 2032775
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032779 - Posted: 17 Feb 2020, 8:11:01 UTC - in response to Message 2032754.  
Last modified: 17 Feb 2020, 8:12:24 UTC

Let me know when you find one using an app that I compiled. It would be interesting to see if it happens on mine too. So far they've all been on yours, if I'm not mistaken. But then again, there are a lot more people running yours, so I understand the statistics surrounding that.
First time it ran:
https://setiathome.berkeley.edu/workunit.php?wuid=3881506537
   Device 1: GeForce GTX 960 is okay
SETI@home using CUDA accelerated device GeForce GTX 960
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1

-------------------------------------------------------------
SETI@home v8 enhanced x41p_V0.99b1p3, CUDA 10.2 special (MPT)
-------------------------------------------------------------------------
Modifications done by petri33, Mutex by Oddbjornik. Compiled by Ian (^_^)
-------------------------------------------------------------------------

Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.009216
Sigma 147
Sigma > GaussTOffsetStop: 147 > -83
Thread call stack limit is: 1k
Acquired CUDA mutex at 02:45:44,944
Triplet: peak=10.95254, time=45.38, period=17.25, d_freq=1419703159.85, chirp=2.5334, fft_len=512 
Triplet: peak=14.3314, time=91.87, period=11.38, d_freq=1419710779.64, chirp=11.735, fft_len=128 
Autocorr: peak=18.13648, time=6.711, delay=4.2569, d_freq=1419707130.68, chirp=14.817, fft_len=128k
Autocorr: peak=18.02195, time=6.711, delay=4.2569, d_freq=1419707130.69, chirp=14.818, fft_len=128k
Spike: peak=24.35987, time=46.98, d_freq=1419704560.3, chirp=-21.401, fft_len=128k
Spike: peak=25.25349, time=46.98, d_freq=1419704560.3, chirp=-21.406, fft_len=128k
Triplet: peak=10.21009, time=27.47, period=23.44, d_freq=1419710295.47, chirp=-36.005, fft_len=1024 
Triplet: peak=12.34042, time=43.65, period=3.971, d_freq=1419706713.34, chirp=85.346, fft_len=128 
Triplet: peak=11.25078, time=43.65, period=3.971, d_freq=1419706706.88, chirp=86.946, fft_len=128 
Normal release of CUDA mutex after 132.333 seconds at 02:47:57,277

Best spike: peak=25.25349, time=46.98, d_freq=1419704560.3, chirp=-21.406, fft_len=128k
Best autocorr: peak=18.13648, time=6.711, delay=4.2569, d_freq=1419707130.68, chirp=14.817, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.121e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=0, time=-2.121e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 
Best triplet: peak=14.3314, time=91.87, period=11.38, d_freq=1419710779.64, chirp=11.735, fft_len=128 

Spike count:    2
Autocorr count: 2
Pulse count:    0
Triplet count:  5
Gaussian count: 0
The correct result is:
Best pulse: peak=3.513659, time=53.74, period=8.651, d_freq=1419708981.7, score=1.077, chirp=13.402, fft_len=1024 
Spike count:    2
Autocorr count: 2
Pulse count:    9
Triplet count:  5
Gaussian count: 0
This will probably be marked Invalid. The machine is using a new BioStar mining board with an eBayed i5. It's the same machine I ran the development version of 20.04 on a few days ago. The MP version failed in the benchmark app; it ran through in seconds without finding any signals, so I didn't try it in BOINC.
ID: 2032779