Setting up Linux to crunch CUDA90 and above for Windows users

Message boards : Number crunching : Setting up Linux to crunch CUDA90 and above for Windows users


Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032694 - Posted: 16 Feb 2020, 17:14:24 UTC - in response to Message 2032693.  

I edited my post. I never had an api_version entry in my app_info either.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032694
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2032698 - Posted: 16 Feb 2020, 17:31:41 UTC - in response to Message 2032694.  

I edited my post. I never had an api_version entry in my app_info either.
And I've edited mine. It probably doesn't matter for personal use, if you always keep the build API pretty close to the version of the BOINC client you run - but worth knowing about if you build for wider distribution.
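
For anyone wondering where that tag actually lives: api_version is an optional element of each <app_version> block in app_info.xml, recording which BOINC API the app was built against. A minimal sketch only; the app and file names here are placeholders, not the actual special-app packaging:

<app_info>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>setiathome_x41p_cuda_example</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>800</version_num>
    <api_version>7.5.0</api_version>
    <plan_class>cuda90</plan_class>
    <file_ref>
      <file_name>setiathome_x41p_cuda_example</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>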
ID: 2032698
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032699 - Posted: 16 Feb 2020, 18:19:27 UTC - in response to Message 2032698.  
Last modified: 16 Feb 2020, 19:19:02 UTC

If you can convince me Jason was wrong for wanting to stay around 6.x, and Petri is wrong for continuing to use 7.5, I might change it. I did try 7.11 for a while, but then decided it would be best to use what Petri is using.
If you check, Jimbocous is running 7 GPUs on an older system that probably wasn't designed to run that many GPUs. On systems not designed for it, you hit a wall around 7-8 GPUs, especially if you are using a PCIe switch. Ask Tom M about that. If you want to run that many GPUs you need a board designed for it: either an expensive server board or a relatively cheap mining board. For the price, the mining board works for me.
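
If anyone wants to check how many GPUs a board is actually enumerating and how they are wired up, a couple of stock commands will show it (a rough sketch; the exact output depends on your driver version):

lspci | grep -i nvidia                                      # every NVIDIA device visible on the PCIe bus
nvidia-smi --query-gpu=index,name,pci.bus_id --format=csv   # what the driver itself sees
nvidia-smi topo -m                                          # link matrix, shows which GPUs sit behind a switch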

BTW, ASUS is still updating their mining board; look at what the updates have mentioned:
2019/10/21 5.64 MBytes
B250 MINING EXPERT BIOS 1208
Improved system compatibility

2019/03/15 5.64 MBytes
B250 MINING EXPERT BIOS 1207
Improve system security and stability

2018/07/20 5.64 MBytes
B250 MINING EXPERT BIOS 1206
1.Improve system compatibility and stability

2018/06/01 5.64 MBytes
B250 MINING EXPERT BIOS 1205
Intel New ME Update, Improve Stability.

2018/04/12 5.68 MBytes
B250 MINING EXPERT BIOS 1010
1. Improve memory compatibility.

2018/03/23 5.68 MBytes
B250 MINING EXPERT BIOS 1006
1. Update CPU Microcode
2. Improve system security and stability

It keeps going; this was the second one they released:
2017/10/09 5.67 MBytes
B250 MINING EXPERT BIOS 0401
Improve system stability
ID: 2032699
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032700 - Posted: 16 Feb 2020, 18:44:56 UTC - in response to Message 2032699.  
Last modified: 16 Feb 2020, 18:49:09 UTC

Some people just have an "if it ain't broke, don't fix it" mentality. It works fine for a while, until the world moves on around you and your outdated platform starts to break. Notice all the 6.x guys recently, suddenly asking why they can't communicate with the project due to outdated security certificates, needing to jump through hoops just to keep things going, some not even able to and being forced to upgrade. You yourself have even commented on Petri's idiosyncrasies. Nothing wrong with that, and Petri is more than capable of handling any issues that might come up. I fall into this sometimes: since my systems are so stable, I usually don't do any unnecessary updates, which keeps me from having a lot of the issues like systems breaking from a driver update or something similar. But I do it within reason, and update after the seas are calm. Same reason I've avoided the Ubuntu short-term releases, but I will move to 20.04 LTS when it's proven stable.
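
For anyone stuck on that certificate problem with an old client, the usual workaround is to drop a current CA bundle over the stale ca-bundle.crt that ships with BOINC. Roughly like this, assuming a stock Ubuntu repository install with the data directory at /var/lib/boinc-client (adjust the path for your own setup):

sudo wget https://curl.se/ca/cacert.pem -O /var/lib/boinc-client/ca-bundle.crt
sudo systemctl restart boinc-client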

My 7x 2070 system on the Z270 platform is very similar to your 14-GPU B250 mining system in the way they work and the components used. Total system cost is likely within a few hundred dollars. Doing more work with half as many cards, less power draw, and fewer headaches: I'll take that any day.

I may update that system soon, since I just replaced my Z370-based test bench with my backup board for the watercooled system. I could take the Z370 board, drop in a relatively cheap 12-thread CPU, and add another GPU or two to the system. Still undecided.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032700
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2032701 - Posted: 16 Feb 2020, 18:50:59 UTC - in response to Message 2032699.  

I think Jason was hoping to persuade BOINC to leave runtime estimation as a client function, through submitting code (which I tested for a while in a working client) to maintain separate DCF values for each application version. I would agree with him, because the current runtime estimation server-side code suffers badly from boundary-condition errors (new installations and new application versions) and from slow response times.

But that was 10 years ago, and the world has moved on - we are stuck, for better or worse, with CreditNew and the associated runtime estimation model.

I would suggest that it's better to stay in touch with the newer BOINC developments: after all, the SETI (science) applications have no productive purpose outside the BOINC framework, and in my view it's better to work with BOINC than to keep re-running ten-year-old battles against it.
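
For readers who weren't around for that argument: DCF, the duration correction factor, was a single per-project multiplier kept by the client, and the estimate was roughly

    estimated runtime ≈ (rsc_fpops_est / projected flops of the app version) × DCF

with DCF nudged after every completed task so the estimates tracked reality. Jason's patch essentially kept one such factor per application version instead of one per project; CreditNew moved the whole estimate server-side instead.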
ID: 2032701
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032703 - Posted: 16 Feb 2020, 19:10:30 UTC - in response to Message 2032701.  

I think Jason was hoping to persuade BOINC to leave runtime estimation as a client function, through submitting code (which I tested for a while in a working client) to maintain separate DCF values for each application version. I would agree with him, because the current runtime estimation server-side code suffers badly from boundary-condition errors (new installations and new application versions) and from slow response times.

But that was 10 years ago, and the world has moved on - we are stuck, for better or worse, with CreditNew and the associated runtime estimation model.

I would suggest that it's better to stay in touch with the newer BOINC developments: after all, the SETI (science) applications have no productive purpose outside the BOINC framework, and in my view it's better to work with BOINC than to keep re-running ten-year-old battles against it.

100% agree
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032703
Phud Redux

Joined: 20 Apr 16
Posts: 270
Credit: 2,976,272
RAC: 1
United States
Message 2032711 - Posted: 16 Feb 2020, 20:46:39 UTC

Could one of you fine gentlemen please post the links to the Linux apps?

Please and many thanks!
ID: 2032711
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2032713 - Posted: 16 Feb 2020, 21:04:18 UTC

. . HEY GUYS!!!

. . This discussion has turned this thread into a developers' forum, which was never its intention. Some poor Linux newbie who wanders into this will be "WTF!" It was meant to provide user-level support so people could migrate to a more productive platform ... So please remember the KISS principle ...

Stephen

< shrug >
ID: 2032713
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032714 - Posted: 16 Feb 2020, 21:06:40 UTC - in response to Message 2032711.  
Last modified: 16 Feb 2020, 21:09:08 UTC

Could one of you fine gentlemen please post the links to the Linux apps?

Please and many thanks!


First, I would recommend that you upgrade to the 440.xx drivers so that you can use the CUDA 10.2 builds. It'll be a benefit for your Pascal- and Turing-based cards.

After that, you can get TBar's V0.98 builds here: http://www.arkayn.us/lunatics/BOINC.7z
This is a complete "All-in-One" BOINC package; you do not need the repository install, and you should remove it before using this.

If you want to try a newer V0.99 mutex-enabled build (it pre-loads a second WU, eliminating downtime between WUs), you can use my build here: https://setiathome.berkeley.edu/forum_thread.php?id=84933 - but I would recommend waiting a few hours. I just made several new builds that should be faster and give people more flexibility on supported hardware, and I'm running them through my test suite now.

I would download TBar's package first and get up and running with it. Then, after I gather my test results, you can decide if you want to modify it with a different app.
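
A rough outline of those steps on Ubuntu, for anyone new to this (the package names are the usual ones from the graphics-drivers PPA; check what suits your own cards and setup before running anything):

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-440            # 440.xx for the CUDA 10.2 builds; reboot afterwards
sudo apt remove boinc-client boinc-manager    # drop the repository BOINC before using the All-in-One
wget http://www.arkayn.us/lunatics/BOINC.7z
7z x BOINC.7z -o"$HOME"                       # needs p7zip-full; unpacks the All-in-One package
cd ~/BOINC && ./boinc                         # exact folder name and layout depend on the archive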
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032714
Phud Redux

Joined: 20 Apr 16
Posts: 270
Credit: 2,976,272
RAC: 1
United States
Message 2032715 - Posted: 16 Feb 2020, 21:18:34 UTC - in response to Message 2032714.  

Thanks.
ID: 2032715
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032751 - Posted: 17 Feb 2020, 2:40:55 UTC

Ha, found another one, https://setiathome.berkeley.edu/workunit.php?wuid=3886727662

   Device 1: GeForce GTX 750 Ti is okay
SETI@home using CUDA accelerated device GeForce GTX 750 Ti
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1

setiathome v8 enhanced x41p_V0.98b1, Cuda 9.00 special
Modifications done by petri33, compiled by TBar

Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.011773
Sigma 62
Sigma > GaussTOffsetStop: 62 > 2
Thread call stack limit is: 1k
Pulse: peak=10.53272, time=45.86, period=30.11, d_freq=7696464951.88, score=1.057, chirp=-2.8368, fft_len=1024 
setiathome_CUDA: Found 1 CUDA device(s):
  Device 1: GeForce GTX 750 Ti, 1999 MiB, regsPerBlock 65536
     computeCap 5.0, multiProcs 5 
     pciBusID = 1, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 750 Ti is okay
SETI@home using CUDA accelerated device GeForce GTX 750 Ti
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1

setiathome v8 enhanced x41p_V0.98b1, Cuda 9.00 special
Modifications done by petri33, compiled by TBar

Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.011773
Sigma 62
Sigma > GaussTOffsetStop: 62 > 2
Thread call stack limit is: 1k
Triplet: peak=11.00201, time=29.44, period=28.58, d_freq=7696462309.32, chirp=14.391, fft_len=256 
Spike: peak=24.16754, time=42.95, d_freq=7696464873.91, chirp=18.156, fft_len=64k
Triplet: peak=11.72892, time=59.15, period=6.845, d_freq=7696464753.79, chirp=70.223, fft_len=1024 
Triplet: peak=10.59451, time=55.71, period=16.65, d_freq=7696462779.31, chirp=-96.307, fft_len=128 

Best spike: peak=24.16754, time=42.95, d_freq=7696464873.91, chirp=18.156, fft_len=64k
Best autocorr: peak=17.2569, time=74.45, delay=3.2613, d_freq=7696461019.14, chirp=3.9728, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 
Best triplet: peak=11.72892, time=59.15, period=6.845, d_freq=7696464753.79, chirp=70.223, fft_len=1024 

Spike count:    1
Autocorr count: 0
Pulse count:    0
Triplet count:  3
Gaussian count: 0
The correct result:
Best pulse: peak=2.095706, time=45.86, period=3.081, d_freq=7696462738.7, score=1.068, chirp=-50.368, fft_len=1024 
Spike count:    1
Autocorr count: 0
Pulse count:    19
Triplet count:  3
Gaussian count: 0
It found a pulse before the reboot, then missed them all afterwards. Same as on a Mac.
ID: 2032751
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2032754 - Posted: 17 Feb 2020, 2:50:22 UTC - in response to Message 2032751.  

Let me know when you find one using an app that I compiled. It would be interesting to see if it happens on mine too. So far they've all been on yours, if I'm not mistaken. But then again, there are a lot more people running yours, so I understand the statistics surrounding that.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2032754
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2032763 - Posted: 17 Feb 2020, 4:44:10 UTC - in response to Message 2032699.  

If you check, Jimbocous is running 7 GPUs on an older system that probably wasn't designed to run that many GPUs. On systems not designed for it, you hit a wall around 7-8 GPUs, especially if you are using a PCIe switch.
The system I am discussing has 5 GPUs. Paradoxically, the one with 7 GPUs doesn't seem to have these issues. And the wall on those HP mobos is definitely 7 GPUs.
ID: 2032763
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2032764 - Posted: 17 Feb 2020, 4:48:02 UTC - in response to Message 2032508.  
Last modified: 17 Feb 2020, 4:49:40 UTC

That's a nasty one, but - I think - different.

The first task wrote
<message>Process still present 5 min after writing finish file; aborting</message>
and then went on to write a normal std_err, right down to 'called boinc_finish(0)'
...

As the comment says, "it must be hung somewhere in boinc_finish()" - very late in boinc_finish, if it wrote the file over five minutes ago. But it would be a normal part of BOINC's exit function to copy std_err.txt from the slot folder into client_state.xml, so that it's reported...
The point being that this seems to be where the driver gets crashed, not just the client.

@Juan, had another one this morning, but sorry, wasn't awake enough to get the slots info you asked for ... I'll keep trying.
ID: 2032764
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032768 - Posted: 17 Feb 2020, 5:31:04 UTC - in response to Message 2032763.  

If you check, Jimbocous is running 7 GPUs on an older system that probably wasn't designed to run that many GPUs. On systems not designed for it, you hit a wall around 7-8 GPUs, especially if you are using a PCIe switch.
The system I am discussing has 5 GPUs. Paradoxically, the one with 7 GPUs doesn't seem to have these issues. And the wall on those HP mobos is definitely 7 GPUs.
Hmmm, OK, it does seem to be an older one now that I look at it. But it does say 7, with the same error:
https://setiathome.berkeley.edu/result.php?resultid=8485188776
<message>
Process still present 5 min after writing finish file; aborting</message>
<stderr_txt>
setiathome_CUDA: Found 7 CUDA device(s):
Device 1: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 5, pciSlotID = 0
Device 2: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 6, pciSlotID = 0
Device 3: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 7, pciSlotID = 0
Device 4: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 8, pciSlotID = 0
Device 5: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 15, pciSlotID = 0
Device 6: GeForce GTX 980, 4043 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 28, pciSlotID = 0
Device 7: GeForce GTX 980, 4040 MiB, regsPerBlock 65536
computeCap 5.2, multiProcs 16
pciBusID = 40, pciSlotID = 0
ID: 2032768
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2032770 - Posted: 17 Feb 2020, 5:39:30 UTC - in response to Message 2032768.  

Hmmm, OK, it does seem to be an older one now that I look at it. But it does say 7, with the same error
Would seem to argue against any mobo GPU count hard limit as being the issue ...
ID: 2032770
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032772 - Posted: 17 Feb 2020, 5:49:03 UTC - in response to Message 2032770.  

On the other one with 5 GPUs, it seems you are getting Overflows while your Wingmen aren't, https://setiathome.berkeley.edu/results.php?hostid=8859436&state=6
ID: 2032772
Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2032773 - Posted: 17 Feb 2020, 5:57:06 UTC - in response to Message 2032772.  
Last modified: 17 Feb 2020, 5:59:32 UTC

On the other one with 5 GPUs, it seems you are getting Overflows while your Wingmen aren't, https://setiathome.berkeley.edu/results.php?hostid=8859436&state=6

If you've been following my discussion of this, you'll note that the overflows are all subsequent to the initial failure, as a result of the NV driver no longer knowing the card exists.
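
That is easy enough to confirm when it happens: when the driver loses a card you normally get NVIDIA Xid entries in the kernel log, and the device drops out of nvidia-smi. For example:

dmesg | grep -i xid     # Xid errors logged when a GPU faults or falls off the bus
nvidia-smi -L           # list the GPUs the driver can still see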
ID: 2032773
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032775 - Posted: 17 Feb 2020, 6:14:57 UTC - in response to Message 2032773.  

Have you tried it with a different driver? I see 390 is in the repository, and that will work with CUDA 9.0. I remember running 396 for quite a while, if that driver is possible for you.
ID: 2032775
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2032779 - Posted: 17 Feb 2020, 8:11:01 UTC - in response to Message 2032754.  
Last modified: 17 Feb 2020, 8:12:24 UTC

Let me know when you find one using an app that I compiled. It would be interesting to see if it happens on mine too. So far they've all been on yours, if I'm not mistaken. But then again, there are a lot more people running yours, so I understand the statistics surrounding that.
First time it ran:
https://setiathome.berkeley.edu/workunit.php?wuid=3881506537
   Device 1: GeForce GTX 960 is okay
SETI@home using CUDA accelerated device GeForce GTX 960
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1

-------------------------------------------------------------
SETI@home v8 enhanced x41p_V0.99b1p3, CUDA 10.2 special (MPT)
-------------------------------------------------------------------------
Modifications done by petri33, Mutex by Oddbjornik. Compiled by Ian (^_^)
-------------------------------------------------------------------------

Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.009216
Sigma 147
Sigma > GaussTOffsetStop: 147 > -83
Thread call stack limit is: 1k
Acquired CUDA mutex at 02:45:44,944
Triplet: peak=10.95254, time=45.38, period=17.25, d_freq=1419703159.85, chirp=2.5334, fft_len=512 
Triplet: peak=14.3314, time=91.87, period=11.38, d_freq=1419710779.64, chirp=11.735, fft_len=128 
Autocorr: peak=18.13648, time=6.711, delay=4.2569, d_freq=1419707130.68, chirp=14.817, fft_len=128k
Autocorr: peak=18.02195, time=6.711, delay=4.2569, d_freq=1419707130.69, chirp=14.818, fft_len=128k
Spike: peak=24.35987, time=46.98, d_freq=1419704560.3, chirp=-21.401, fft_len=128k
Spike: peak=25.25349, time=46.98, d_freq=1419704560.3, chirp=-21.406, fft_len=128k
Triplet: peak=10.21009, time=27.47, period=23.44, d_freq=1419710295.47, chirp=-36.005, fft_len=1024 
Triplet: peak=12.34042, time=43.65, period=3.971, d_freq=1419706713.34, chirp=85.346, fft_len=128 
Triplet: peak=11.25078, time=43.65, period=3.971, d_freq=1419706706.88, chirp=86.946, fft_len=128 
Normal release of CUDA mutex after 132.333 seconds at 02:47:57,277

Best spike: peak=25.25349, time=46.98, d_freq=1419704560.3, chirp=-21.406, fft_len=128k
Best autocorr: peak=18.13648, time=6.711, delay=4.2569, d_freq=1419707130.68, chirp=14.817, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.121e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=0, time=-2.121e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 
Best triplet: peak=14.3314, time=91.87, period=11.38, d_freq=1419710779.64, chirp=11.735, fft_len=128 

Spike count:    2
Autocorr count: 2
Pulse count:    0
Triplet count:  5
Gaussian count: 0
The correct result is:
Best pulse: peak=3.513659, time=53.74, period=8.651, d_freq=1419708981.7, score=1.077, chirp=13.402, fft_len=1024 
Spike count:    2
Autocorr count: 2
Pulse count:    9
Triplet count:  5
Gaussian count: 0
This will probably be marked Invalid. The machine is using a new BioStar mining board with an eBayed i5. It's the same machine I ran the development version of 20.04 on a few days ago. The MP version failed in the benchmark app; it ran through in seconds without finding any signals, so I didn't try it in BOINC.
ID: 2032779