Message boards : Number crunching : New CUDA 10.2 Linux App Available
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 2,768
OK, the 6700K finally finished the CPU tasks, so I stopped BOINC and ran the GPU tasks on a GTX 1060 using -nobs. I removed some extraneous lines to make it shorter. Basically the same results; all apps failed on the one task:

KWSN-Linux-MBbench v2.1.08 Running on TBar-iSETI at Sun 08 Dec 2019 06:10:11 AM UTC
----------------------------------------------------------------
Starting benchmark run...
----------------------------------------------------------------
Listing wu-file(s) in /testWUs :
blc14_2bit_guppi_58691_83520_HIP79781_0103.15702.0.21.44.152.vlar.wu
blc14_2bit_guppi_58691_83520_HIP79781_0103.25370.0.22.45.105.vlar.wu
blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu
blc14_2bit_guppi_58692_02937_HIP79792_0121.10280.409.21.44.21.vlar.wu
blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu

Listing executable(s) in /APPS :
setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101
setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102
setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90

Listing executable in /REF_APPS :
MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu
----------------------------------------------------------------
Current WU: blc14_2bit_guppi_58691_83520_HIP79781_0103.15702.0.21.44.152.vlar.wu
----------------------------------------------------------------
Skipping default app MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 4038 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101 -nobs -device 1
Elapsed Time : ...................... 150 seconds
Speed compared to default : ......... 2692 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.27%
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 -nobs -device 1
Elapsed Time : ...................... 149 seconds
Speed compared to default : ......... 2710 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.27%
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90 -nobs -device 0
Elapsed Time : ...................... 154 seconds
Speed compared to default : ......... 2622 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.27%
----------------------------------------------------------------
Done with blc14_2bit_guppi_58691_83520_HIP79781_0103.15702.0.21.44.152.vlar.wu
====================================================================
Current WU: blc14_2bit_guppi_58691_83520_HIP79781_0103.25370.0.22.45.105.vlar.wu
----------------------------------------------------------------
Skipping default app MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 4126 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101 -nobs -device 1
Elapsed Time : ...................... 150 seconds
Speed compared to default : ......... 2750 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.76%
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 -nobs -device 1
Elapsed Time : ...................... 149 seconds
Speed compared to default : ......... 2769 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.76%
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90 -nobs -device 1
Elapsed Time : ...................... 154 seconds
Speed compared to default : ......... 2679 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.76%
----------------------------------------------------------------
Done with blc14_2bit_guppi_58691_83520_HIP79781_0103.25370.0.22.45.105.vlar.wu
====================================================================
Current WU: blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu
----------------------------------------------------------------
Skipping default app MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 3920 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101 -nobs -device 1
Elapsed Time : ...................... 150 seconds
Speed compared to default : ......... 2613 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.37%
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 -nobs -device 1
Elapsed Time : ...................... 150 seconds
Speed compared to default : ......... 2613 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.37%
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90 -nobs -device 1
Elapsed Time : ...................... 154 seconds
Speed compared to default : ......... 2545 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.37%
----------------------------------------------------------------
Done with blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu
====================================================================
Current WU: blc14_2bit_guppi_58692_02937_HIP79792_0121.10280.409.21.44.21.vlar.wu
----------------------------------------------------------------
Skipping default app MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 3207 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101 -nobs -device 1
Elapsed Time : ...................... 115 seconds
Speed compared to default : ......... 2788 %
-----------------
Comparing results
                ------------- R1:R2 ------------   ------------- R2:R1 ------------
                Exact Super Tight  Good   Bad   Exact Super Tight  Good   Bad
Spike               0     0     0     0     0       0     0     0     0     0
Autocorr            0     1     1     1     0       0     1     1     1     0
Gaussian            0     0     0     0     0       0     0     0     0     0
Pulse               0     4     4     4     1       0     4     4     4     0
Triplet             0     1     1     1     0       0     1     1     1     0
Best Spike          0     1     1     1     0       0     1     1     1     0
Best Autocorr       0     1     1     1     0       0     1     1     1     0
Best Gaussian       1     1     1     1     0       1     1     1     1     0
Best Pulse          0     1     1     1     0       0     1     1     1     0
Best Triplet        0     1     1     1     0       0     1     1     1     0
                 ----  ----  ----  ----  ----    ----  ----  ----  ----  ----
                    1    11    11    11     1       1    11    11    11     0
Unmatched signal(s) in R1 at line(s) 370
For R1:R2 matched signals only, Q= 99.75%
Result : Weakly similar.
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 -nobs -device 1
Elapsed Time : ...................... 116 seconds
Speed compared to default : ......... 2764 %
-----------------
Comparing results
                ------------- R1:R2 ------------   ------------- R2:R1 ------------
                Exact Super Tight  Good   Bad   Exact Super Tight  Good   Bad
Spike               0     0     0     0     0       0     0     0     0     0
Autocorr            0     1     1     1     0       0     1     1     1     0
Gaussian            0     0     0     0     0       0     0     0     0     0
Pulse               0     4     4     4     1       0     4     4     4     0
Triplet             0     1     1     1     0       0     1     1     1     0
Best Spike          0     1     1     1     0       0     1     1     1     0
Best Autocorr       0     1     1     1     0       0     1     1     1     0
Best Gaussian       1     1     1     1     0       1     1     1     1     0
Best Pulse          0     1     1     1     0       0     1     1     1     0
Best Triplet        0     1     1     1     0       0     1     1     1     0
                 ----  ----  ----  ----  ----    ----  ----  ----  ----  ----
                    1    11    11    11     1       1    11    11    11     0
Unmatched signal(s) in R1 at line(s) 370
For R1:R2 matched signals only, Q= 99.75%
Result : Weakly similar.
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90 -nobs -device 1
Elapsed Time : ...................... 118 seconds
Speed compared to default : ......... 2717 %
-----------------
Comparing results
                ------------- R1:R2 ------------   ------------- R2:R1 ------------
                Exact Super Tight  Good   Bad   Exact Super Tight  Good   Bad
Spike               0     0     0     0     0       0     0     0     0     0
Autocorr            0     1     1     1     0       0     1     1     1     0
Gaussian            0     0     0     0     0       0     0     0     0     0
Pulse               0     4     4     4     1       0     4     4     4     0
Triplet             0     1     1     1     0       0     1     1     1     0
Best Spike          0     1     1     1     0       0     1     1     1     0
Best Autocorr       0     1     1     1     0       0     1     1     1     0
Best Gaussian       1     1     1     1     0       1     1     1     1     0
Best Pulse          0     1     1     1     0       0     1     1     1     0
Best Triplet        0     1     1     1     0       0     1     1     1     0
                 ----  ----  ----  ----  ----    ----  ----  ----  ----  ----
                    1    11    11    11     1       1    11    11    11     0
Unmatched signal(s) in R1 at line(s) 370
For R1:R2 matched signals only, Q= 99.75%
Result : Weakly similar.
----------------------------------------------------------------
Done with blc14_2bit_guppi_58692_02937_HIP79792_0121.10280.409.21.44.21.vlar.wu
====================================================================
Current WU: blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu
----------------------------------------------------------------
Skipping default app MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 4033 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101 -nobs -device 1
Elapsed Time : ...................... 137 seconds
Speed compared to default : ......... 2943 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.73%
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda102 -nobs -device 1
Elapsed Time : ...................... 136 seconds
Speed compared to default : ......... 2965 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.73%
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90 -nobs -device 1
Elapsed Time : ...................... 142 seconds
Speed compared to default : ......... 2840 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.73%
----------------------------------------------------------------
Done with blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu
====================================================================

It's a shame the hyper-threaded 6700K is so much slower than the non-hyper-threaded 6600.
I'm pretty sure I know why some of the 10.1 builds occasionally miss a pulse more often than others, but it seems you still miss one on occasion with any build...must be something in the code.
Joined: 29 Apr 01 · Posts: 13164 · Credit: 1,160,866,277 · RAC: 1,873
> It's a shame the hyper-threaded 6700K is so much slower than the non-hyper-threaded 6600. I'm pretty sure I know why some of the 10.1 builds occasionally miss a pulse more often than others, but it seems you still miss one on occasion with any build...must be something in the code.

TBar, would you care to elaborate on what you suspect causes the missed pulses? I have wanted to figure out why my zi3v app under-reports the pulse count most of the time compared to the CPU app or the SoG app.

Seti@Home classic workunits: 20,676 · CPU time: 74,226 hours
A proud member of the OFA (Old Farts Association)
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 2,768
If you are referring to your Tegra X1, I don't think it will make much difference. If you used anything lower than S_Limit 191 when compiling the app, try it with 191; CUDA 10.x doesn't seem to like the same settings that work in CUDA 9.x. The problem I saw in CUDA 10.x was rare, but your problem with the Tegra isn't rare. I don't think the SETI code will work on the Tegra X1 using CUDA 10; it's a shame you can't use something like CUDA 6.5 on that device. There was a reason Jason stopped the Windows CUDA apps at 5.0: anything higher just wasn't any better. Sometimes higher isn't better; the highest CUDA version that will work correctly with the SETI code on a Kepler cc3.5 GPU is CUDA 6.0.
Joined: 27 May 99 · Posts: 309 · Credit: 70,759,933 · RAC: 3
> Kinda hard to tell what's going on in that format. The one you should be concerned with is the one that contains "Weakly similar". That means the results didn't match. It would help if you ran that task in the Lunatics Benchmark App with a CPU App in REF_APPs and the GPU Apps in APPS. The Terminal readout will be much easier to understand and identify the App that doesn't match the CPU App. I'm running it on another machine right now, but the CPU tasks will take a little while.

The version I built and tested with that benchmark used only TBar's Xbranch and the new NVIDIA 10.2 library. At that time I did not have access to Petri's new code, which includes the mutex routines, so the benchmarks are based only on Petri's code written 4/15/2019 or earlier, AFAICT. Not sure how useful it is, but I put a script here to easily make the necessary changes to TBar's makefile so it builds on any (?) host.
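The kind of change such a script automates can be sketched with sed. The Makefile fragment and variable names below are invented for illustration only, not TBar's actual makefile:

```shell
# Create a stand-in Makefile fragment (invented content, demo only).
cat > Makefile.demo <<'EOF'
CUDA_ROOT = /usr/local/cuda-9.0
NVCC = $(CUDA_ROOT)/bin/nvcc
EOF

# Point CUDA_ROOT at whatever toolkit this host actually has installed.
CUDA_PATH="${CUDA_PATH:-/usr/local/cuda-10.2}"
sed -i "s|^CUDA_ROOT *=.*|CUDA_ROOT = ${CUDA_PATH}|" Makefile.demo

grep '^CUDA_ROOT' Makefile.demo   # -> CUDA_ROOT = /usr/local/cuda-10.2
```

The same one-liner pattern extends to library and include paths; the real script would patch the project's actual variables instead of this demo fragment.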
Ian&Steve C. · Joined: 28 Sep 99 · Posts: 4267 · Credit: 1,282,604,591 · RAC: 6,640
I built a mutex CUDA 10.2 app; it's available in the other thread if you want it. It likely won't show much difference in benchmarks unless you run it over and over, with 2 WUs per GPU on the mutex app, in an automated fashion so that you can see the differences.

Seti@Home classic workunits: 29,492 · CPU time: 134,419 hours
Ville Saari · Joined: 30 Nov 00 · Posts: 1158 · Credit: 49,177,052 · RAC: 82,530
> CUDA 10.2 drivers have been in the Ubuntu drivers PPA since driver 440.26 (added mid October). It is now up to date with driver 440.36. The PPA is far easier to use on Ubuntu systems.

What PPA are you using? The graphics-drivers PPA my system is using doesn't have anything higher than 430.40, and I have been using 418.56 because that 430 made my computer unable to boot when I tried it a few months ago. Recovering from that was a big pain :( Also, that 418 is older than the GPU in one of my computers, so it doesn't know its name and calls it just 'Graphics Device' ;) A newer driver would be nice if it actually worked, especially if it fixed a bug in 418.56 that occasionally makes it enter an infinite loop in kernel code when running the special app. When a process becomes unresponsive in a kernel call, there's no way to kill it short of a system reboot :(
Ian&Steve C. · Joined: 28 Sep 99 · Posts: 4267 · Credit: 1,282,604,591 · RAC: 6,640
You may need to update your version of Ubuntu to 18.04 or higher. I use the same PPA you linked, but I believe the newer drivers are version-locked to the Ubuntu release. Otherwise you will need to use the NVIDIA runfile installer.
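For reference, the usual PPA route on Ubuntu 18.04+ looks roughly like this. The driver package name here is only an example; check what the PPA currently offers for your release, and the guard simply makes the snippet a no-op on non-Ubuntu systems:

```shell
# Example package name -- list what's actually available with:
#   apt list 'nvidia-driver-*' 2>/dev/null
DRIVER=nvidia-driver-440

# Skip entirely on systems without the Ubuntu PPA tooling.
if command -v add-apt-repository >/dev/null 2>&1; then
  sudo add-apt-repository -y ppa:graphics-drivers/ppa
  sudo apt update
  sudo apt install -y "$DRIVER"
fi
```

The runfile route (downloading NVIDIA-Linux-x86_64-*.run from nvidia.com and running it with the display manager stopped) avoids the PPA's release locking, at the cost of redoing the install after kernel updates.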
Joined: 28 Apr 00 · Posts: 35 · Credit: 128,746,856 · RAC: 230
> As far as the Release Notes, this is All you need to read, "Note that support for these compute capabilities may be removed in a future release of CUDA." That means Maxwell will still work with 10.2, but MAY be removed on Newer ToolKits. nVidia has been complaining about Clang 6.0 since ToolKit 10.0, but it still works with ToolKit 10.2 even though it has been listed as 'deprecated' for some time now.

The CUDA 10.2 special app throws SIGSEGV: segmentation violation errors on one of my systems:

Libraries: libcurl/7.58.0 OpenSSL/1.1.1 zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3
CUDA: NVIDIA GPU 0: GeForce GTX TITAN X (driver version 440.44, CUDA version 10.2, compute capability 5.2, 4096MB, 4006MB available, 6611 GFLOPS peak)
OpenCL: NVIDIA GPU 0: GeForce GTX TITAN X (driver version 440.44, device version OpenCL 1.2 CUDA, 12210MB, 4006MB available, 6611 GFLOPS peak)
OS: Linux Ubuntu: Ubuntu 18.04.3 LTS [5.0.0-37-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]

The NVIDIA driver is from the graphics-drivers PPA. The CUDA 10.1 special app is working fine on the same system.
Ian&Steve C. · Joined: 28 Sep 99 · Posts: 4267 · Credit: 1,282,604,591 · RAC: 6,640
Retvari, can you try my version of the CUDA 10.2 app? I built it with gencode for CC 5.0, but I do not have any Maxwell cards to try it, so please let me know if it works. Download here: https://drive.google.com/open?id=1hTuDeZhtGDwyEiVQnsUZfZvcvrFZIPF6

This is a mutex build, but the mutex function will not be used if you are set to run only 1 WU at a time.
Joined: 28 Apr 00 · Posts: 35 · Credit: 128,746,856 · RAC: 230
Your version (the CUDA 10.2 mutex build) is working fine on the same host. (I can't give a link to a finished WU at the moment, as the servers are struggling.)
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 2,768
As per the ReadMe, the 10.2 version in the All-In-One requires Pascal and higher; Maxwells run just about as well on CUDA 9.0. If you add Maxwell to the 10.2 app, the app will be around 220 MB instead of around 200 MB, and the larger you make the app, the slower it will run for everyone. For the newer GPUs it's best to keep Maxwell out of the app. That's why the ReadMe says Maxwell can use the CUDA 9.0 app; it won't work with the smaller app.
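The size trade-off TBar describes comes from nvcc fat binaries: each -gencode pair embeds another compiled copy of the kernels for that compute capability. A rough sketch of the two build variants; these flag sets are illustrative, not the flags from the actual Makefile:

```shell
# Illustrative -gencode sets, NOT the real project flags.
# Pascal-and-up fat binary (the smaller ~200 MB app):
GENCODE_PASCAL="-gencode arch=compute_60,code=sm_60 \
 -gencode arch=compute_61,code=sm_61 \
 -gencode arch=compute_75,code=sm_75"

# Adding Maxwell (cc 5.0/5.2) embeds two more kernel copies,
# which is where the extra ~20 MB comes from:
GENCODE_MAXWELL="-gencode arch=compute_50,code=sm_50 \
 -gencode arch=compute_52,code=sm_52"

echo "Pascal+ flags    : $GENCODE_PASCAL"
echo "Maxwell additions: $GENCODE_MAXWELL"
```

A GPU whose compute capability has no matching embedded code (and no PTX to JIT from) simply can't run the binary, which is why Maxwell "won't work with the smaller app".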
Joined: 28 Apr 00 · Posts: 35 · Credit: 128,746,856 · RAC: 230
> As per the ReadMe, the 10.2 version in the All-In-One requires Pascal and Higher. Maxwells run just about as well on CUDA 9.0. If you add Maxwell to the 10.2 App, the App will be around 220MBs instead of around 200MBs. The larger you make the App, the Slower it will run for everyone. For the Newer GPUs it's best to keep Maxwell out of the App. That's why it says Maxwell can use the CUDA 9.0 App, it won't work with the smaller App.

Thanks for this clarification. I got lost where you said "...that means Maxwell will still work with 10.2, but MAY be removed on Newer ToolKits". In that sentence the 10.2 refers to the CUDA toolkit version, not your app. Now it's clear.
Ian&Steve C. · Joined: 28 Sep 99 · Posts: 4267 · Credit: 1,282,604,591 · RAC: 6,640
Glad to hear that my version of the CUDA 10.2 app is working fine on Maxwell cards too :)
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 2,768
> As per the ReadMe, the 10.2 version in the All-In-One requires Pascal and Higher. Maxwells run just about as well on CUDA 9.0. If you add Maxwell to the 10.2 App, the App will be around 220MBs instead of around 200MBs. The larger you make the App, the Slower it will run for everyone. For the Newer GPUs it's best to keep Maxwell out of the App. That's why it says Maxwell can use the CUDA 9.0 App, it won't work with the smaller App.
> Thanks for this clarification.

This is the full quote:

> As far as the Release Notes, this is All you need to read, "Note that support for these compute capabilities may be removed in a future release of CUDA." That means Maxwell will still work with 10.2, but MAY be removed on Newer ToolKits. nVidia has been complaining about Clang 6.0 since ToolKit 10.0, but it still works with ToolKit 10.2 even though it has been listed as 'deprecated' for some time now.
Aurum · Joined: 1 Jan 19 · Posts: 10 · Credit: 17,623,160 · RAC: 2
> Time isn't just flying, it's flying faster each year.

That's because each passing day is a smaller and smaller percentage of your memory.
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 2,768
Well, that's just great... It seems every time we get all the apps working well, SETI changes something that breaks them, and this time they broke everything :-(

I thought about what might be needed next and decided the latest and greatest Berkeley version of BOINC might be nice. You will need a nicely working version of BOINC that runs from your Home folder for all those other projects. I was hoping to test it with an overflow storm to make sure it works with 14 GPUs reporting 100+ tasks every 5 minutes, but I don't suppose that will happen now. Otherwise, it's been working great over the past few days of testing. This version, as with the other All-In-Ones, is completely stock with a few settings changed for better performance on high-end systems:

1) The minimum retry time was changed from 120 sec to 15 sec, to keep uploads/downloads from piling up during shorty/overflow storms.
2) The minimum backoff time was changed to 305 sec, also to help with uploads/downloads backing up.
3) The job_limit was increased to 5000.

These changes solve most problems with stuck uploads/downloads. Retry times are in seconds, backoff times are in minutes. It seems to work very well.

I also included the new AVX CPU app and the CUDA 10.2 version that works with Maxwell. The Maxwell version only has precompiled code for Maxwell and Pascal, so use the larger 10.2 version for Pascal and above GPUs. This of course made the file somewhat larger... So, enjoy the newest version of BOINC; it's working well on my machines. It's in the same location, http://www.arkayn.us/lunatics/BOINC.7z and the app is in the All-In-One package at the same location http://www.arkayn.us/lunatics/BOINC.7z
Joined: 1 Dec 11 · Posts: 45 · Credit: 25,258,781 · RAC: 180
Thanks for the app update, TBar, and to everyone who has helped me in the past. I'm way late and let my best workstation sit idle hoping to get back to fixing the app (it stopped working a while ago, but my other workstations have been running). Just fixed it with the new app and updated drivers / CUDA 10.2, for one last run till the end of the month. Wish everyone well, James
Viktor · Joined: 27 Sep 01 · Posts: 2 · Credit: 43,475,391 · RAC: 156
Installed the 10.2 app over the old 10.1. I have Maxwell cards. I did the file swap required, and I did the file_info and file_ref edits as required. For some reason I can only get BOINC/SETI to run on 1 of 2 GPUs. Both are recognized in the BOINC startup and tasks seem to bounce between them. "use_all_gpus" is enabled in cc_config. I am sure I am missing some little something... can anyone point me in the right direction?

Update: self-diagnosed the issue. When people say "don't copy and paste from Windows", that means DON'T COPY AND PASTE FROM WINDOWS. I deleted all whitespace in my cc_config via the terminal and it started working. Odd, since the BOINC log claimed "use all coprocessors" prior to the change. My extreme thanks to all who continue to develop for the project.
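For anyone hitting the same thing, a minimal cc_config.xml with the multi-GPU option looks like the fragment below (structure per BOINC's client configuration options; stray characters from a Windows clipboard are exactly what bit Viktor, so retype it rather than pasting):

```xml
<!-- Minimal cc_config.xml sketch; goes in the BOINC data directory.
     Restart the client (or "Read config files") after editing. -->
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

On startup the client log should then report that it is using all coprocessors, and tasks should be scheduled to both GPUs.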
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.