Linux CUDA 'Special' App finally available, featuring Low CPU use

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895598 - Posted: 16 Oct 2017, 0:19:31 UTC - in response to Message 1895589.  

Are we still in pre-production testing for the CUDA 9.0 apps? Or are they usable in production now?

[bump]


. . I think it is still Caveat Emptor ........

Stephen

:)
ID: 1895598

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1895602 - Posted: 16 Oct 2017, 1:05:14 UTC - in response to Message 1895589.  

It depends on how you decipher this post, https://setiathome.berkeley.edu/forum_thread.php?id=80636&postid=1895161#1895161
To me, that sounds like another change, and I'm still testing.
The clincher is, there aren't any CUDA 9 Links at C.A.
ID: 1895602

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1895603 - Posted: 16 Oct 2017, 1:15:51 UTC - in response to Message 1895602.  

My confusion stems from the stderr.txt output I see in tasks processed by xs2 saying application by Petri33 and released to the public by TBar.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1895603

Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1895604 - Posted: 16 Oct 2017, 1:20:50 UTC - in response to Message 1895589.  

Are we still in pre-production testing for the CUDA 9.0 apps? Or are they usable in production now?

[bump]
The impression I got from Petri's original post was that any "pre-production testing" was something he expected each individual user to simply perform with offline runs on their own configurations. Then, it would be a judgment call for each user as to when, or whether, the switch-over to production could commence. So, basically, you have to make your own call as to how much offline testing is sufficient before going live. Not an ideal approach, but it is what it is. :^)
ID: 1895604

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895627 - Posted: 16 Oct 2017, 5:15:54 UTC - in response to Message 1895597.  

. . OK TBar,

Things to do:

. . 1) Upgrade video drivers to 384.xx. Q - are they in the Ubuntu repository or do I go to nvidia?

. . 2) Upgrade BOINC to 7.8.2 Q - since I haven't had to upgrade BOINC in the 6 months I have been using Linux is there anything I particularly need to DO/NOT Do/Be aware of? {note- I should probably add CA to my fave's list}

. . 3) Suspend all tasks and swap out GPU

. . 4) Resume one task and see if it runs using 3v. If not suspend (clear results files) then go back to earlier version of Special sauce, or maybe to a CUDA65 version.

. . 5) Once I have WUs crunching successfully with the GT730 then run a few through and gather baseline data.

. . 6) Install the test version (CUDA90?) and compare results.

. . 7) Tell TBar what the results are.!

. . Have I missed anything ???

. . If nothing else it will be a good opportunity to give the rig a spring clean :)

Stephen

:)
ID: 1895627

Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1895636 - Posted: 16 Oct 2017, 8:33:04 UTC - in response to Message 1895603.  

My confusion stems from the stderr.txt output I see in tasks processed by xs2 saying application by Petri33 and released to the public by TBar.


If the indicator is "Why am I seeing a huge amount of resends now that the new app has been 'released'?" I would say NO, it is not ready yet. I haven't looked at why I am seeing so many resends, but I have a feeling I know why ... and they are likely validating against each other ...
ID: 1895636

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895647 - Posted: 16 Oct 2017, 12:27:30 UTC - in response to Message 1895627.  

. . UPDATE:

. . My confidence in Murphy's Law is reinforced.


Things to do:

. . 1) Upgrade video drivers to 384.xx. Q - are they in the Ubuntu repository or do I go to nvidia? So now, since this version of the drivers is not in the repository, I will have to re-install the video drivers every time Linux does an update ... :(

. . 2) Upgrade BOINC to 7.8.2 Q - since I haven't had to upgrade BOINC in the 6 months I have been using Linux is there anything I particularly need to DO/NOT Do/Be aware of? {note- I should probably add CA to my fave's list} OK, this is going to be a disaster I feel because this rig is currently running the repository version of BOINC and I do not know how to manually install over the top of that. So I have to ask, is this part absolutely necessary?

. . 3) Suspend all tasks and swap out GPU

. . 4) Resume one task and see if it runs using 3v. If not suspend (clear results files) then go back to earlier version of Special sauce, or maybe to a CUDA65 version.

. . 5) Once I have WUs crunching successfully with the GT730 then run a few through and gather baseline data.

. . 6) Install the test version (CUDA90?) and compare results.

. . 7) Tell TBar what the results are.!

. . Have I missed anything ???

. . If nothing else it will be a good opportunity to give the rig a spring clean :)

Stephen

:)
ID: 1895647

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1895670 - Posted: 16 Oct 2017, 15:29:20 UTC - in response to Message 1895647.  
Last modified: 16 Oct 2017, 15:29:41 UTC

. . UPDATE:

. . My confidence in Murphy's Law is reinforced.


Things to do:

. . 1) Upgrade video drivers to 384.xx. Q - are they in the Ubuntu repository or do I go to nvidia? So now, since this version of the drivers is not in the repository, I will have to re-install the video drivers every time Linux does an update ... :(

. . 2) Upgrade BOINC to 7.8.2 Q - since I haven't had to upgrade BOINC in the 6 months I have been using Linux is there anything I particularly need to DO/NOT Do/Be aware of? {note- I should probably add CA to my fave's list} OK, this is going to be a disaster I feel because this rig is currently running the repository version of BOINC and I do not know how to manually install over the top of that. So I have to ask, is this part absolutely necessary?


I just added the graphics driver ppa and updated through the Package Manager. Pretty easy. I don't know, but I wouldn't think Linux downgrades a driver when it updates. At least it always asks whether you want a package uninstalled or installed during an update.

And yes, you would need the 384.90 level Nvidia drivers to support the new Petri/Tbar CUDA 9.0 app.
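
A quick way to confirm the driver requirement mentioned above is to compare the installed version against 384.90. The sketch below is a minimal check and not part of the app. `version_ge` is a hypothetical helper that assumes plain MAJOR.MINOR version strings, and the installed version is read with `nvidia-smi --query-gpu=driver_version`, which only answers once a driver is actually loaded.

```shell
#!/bin/sh
# Minimum driver level quoted in the thread for the CUDA 9.0 app.
REQUIRED="384.90"

# Hypothetical helper: succeeds when version $1 >= version $2.
# Assumes both look like MAJOR.MINOR (e.g. 384.90).
version_ge() {
    maj1=${1%%.*}; min1=${1#*.}
    maj2=${2%%.*}; min2=${2#*.}
    [ "$maj1" -gt "$maj2" ] && return 0
    [ "$maj1" -eq "$maj2" ] && [ "$min1" -ge "$min2" ]
}

# Ask the loaded driver for its version; empty if nvidia-smi is unavailable.
installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)

if [ -z "$installed" ]; then
    echo "Could not read a driver version (is the NVIDIA driver loaded?)"
elif version_ge "$installed" "$REQUIRED"; then
    echo "Driver $installed is new enough for CUDA 9.0"
else
    echo "Driver $installed is older than $REQUIRED"
fi
```

With the repository 375 series installed this would report the driver as too old, which matches the advice to move to the 384 series first.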
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1895670

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1895689 - Posted: 16 Oct 2017, 16:56:30 UTC - in response to Message 1895647.  

1) Upgrade video drivers to 384.xx. Q - are they in the Ubuntu repository or do I go to nvidia? So now, since this version of the drivers is not in the repository, I will have to re-install the video drivers every time Linux does an update ...
I don't have to reinstall the driver. Make sure you have DKMS installed, sudo apt-get install dkms
I did give you a link to the driver. Download the driver, move it to your home folder, and set the execute bit. Drop into the console, login, sudo stop lightdm, remove the repository driver and run autoremove. When you install the nVidia driver choose to register the kernel module and it will be automatically reinstalled during each update. Works for me. If you can't update the version of BOINC, then Don't.
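
The sequence described above can be laid out as a checklist. This is a dry-run sketch that only prints the commands and runs none of them. The installer filename is an assumption (use whatever file you actually downloaded), and `sudo service lightdm stop` is swapped in for `sudo stop lightdm`, which is the Upstart form and may not exist on newer Ubuntu releases. The quotes around 'nvidia*' keep the shell from expanding the pattern before apt-get sees it.

```shell
#!/bin/sh
# Hypothetical filename; override with DRIVER_RUN=yourfile.run
DRIVER_RUN="${DRIVER_RUN:-NVIDIA-Linux-x86_64-384.90.run}"

# Print one step of the plan; nothing is executed.
step() { printf '%s\n' "$1"; }

print_driver_install_plan() {
    # DKMS rebuilds the kernel module when the kernel is updated.
    step "sudo apt-get install dkms"
    # Set the execute bit on the installer in the home folder.
    step "chmod +x \"\$HOME/$DRIVER_RUN\""
    # From a text console (not the desktop), stop the display manager.
    step "sudo service lightdm stop"
    # Remove the repository driver and its leftovers.
    step "sudo apt-get remove --purge 'nvidia*'"
    step "sudo apt-get autoremove"
    # Run the installer; answer yes when it offers to register with DKMS.
    step "sudo sh \"\$HOME/$DRIVER_RUN\""
    step "sudo reboot"
}

print_driver_install_plan
```

Printing the plan first makes it easy to check each line against your own notes before typing anything at the console.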

I still don't see a Kepler card in that machine yet. Since I last posted, I have done the following,

1) Replaced an NV card with an ATI card so I could boot to El Capitan
2) Booted to El Capitan where Xcode works with Petri's Code
3) Installed the CUDA 9 ToolKit
4) Tried compiling different versions of boinc-master until I found one that worked. Apparently this Screen Saver Fix Broke the latest boinc-master in OSX
5) Tried for hours to get the Static Libraries to work in OSX...it doesn't work
6) Settled for the last CUDA 9 App that doesn't have Static Libraries, zi3x, which apparently is not any better than zi3v.
7) Swapped cards again, booted back to the OS that supports Pascal, and tested the new zi3x App.
...and I still don't see a Kepler card in Stephen's machine. But hey, I have a new CUDA 9 Mac App that seems to be working as usual.
ID: 1895689

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895749 - Posted: 16 Oct 2017, 22:08:03 UTC - in response to Message 1895689.  

...and I still don't see a Kepler card in Stephen's machine. But hey, I have a new CUDA 9 Mac App that seems to be working as usual.


. . Yep, because unlike you I am scared of Linux because it keeps biting me. I tried upgrading the video drivers on the other rig because, with BOINC under the home directory, I felt it would be more stable. But no, it wigged out during the process and I had to "login", so I did as myself; turns out I needed to log in as root, so that went down the drain. I am in the process of trying to get that unit working again. And since the update to release 97 trashed the loader, forcing me to roll it back to release 96, it now keeps losing the loader and then making the flashdrive disappear. I have to keep moving it from port to port to get it seen again so I can roll it back whenever I have to boot. AAAhhhhhhrrfghhgh! Linux .... :(

Stephen
ID: 1895749

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895750 - Posted: 16 Oct 2017, 22:09:42 UTC - in response to Message 1895670.  


I just added the graphics driver ppa and updated through the Package Manager. Pretty easy. I don't know, but I wouldn't think Linux downgrades a driver when it updates. At least it always asks whether you want a package uninstalled or installed during an update.
And yes, you would need the 384.90 level Nvidia drivers to support the new Petri/Tbar CUDA 9.0 app.


. . may i ask. What does ppa mean?

Stephen

??
ID: 1895750

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1895766 - Posted: 16 Oct 2017, 23:47:14 UTC - in response to Message 1895750.  

It stands for Personal Package Archive, a third-party package repository for Ubuntu. I found the information about the official graphics PPA here:
ubuntu-official-ppa-graphics
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1895766

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895783 - Posted: 17 Oct 2017, 2:11:37 UTC - in response to Message 1895689.  

I don't have to reinstall the driver. Make sure you have DKMS installed, sudo apt-get install dkms
I did give you a link to the driver. Download the driver, move it to your home folder, and set the execute bit. Drop into the console, login, sudo stop lightdm, remove the repository driver and run autoremove. When you install the nVidia driver choose to register the kernel module and it will be automatically reinstalled during each update. Works for me. If you can't update the version of BOINC, then Don't.
I still don't see a Kepler card in that machine yet. Since I last posted, I have done the following,


. . That was like having every tooth pulled without anaesthetic ...

. . OK ... here's the thing ...

. . I tried to install DKMS as you said, it was already there. But ...

. . When I went through the procedure silly me used the command "sudo apt-get remove --purge nvidia" to which the response was nothing to remove ... I should have twigged right then ...

. . When I ran the install THIS TIME it offered the option of registering with DKMS. None of the previous driver installs had done that. So I selected yes and it began compiling the DKMS kernel. But that failed and said I needed other things such as pkg-config. When I tried to install that ... it was already there :(

. . I couldn't boot because I was now in limbo and doing the login screen hop!

. . I tried re-installing the repository drivers but it kept saying they were already the latest, second time around it twigged. So I reran "sudo apt-get remove --purge nvidia-375" and what do you know, that did something.

. . I re-ran the install and this time it offered, as it had on each successful past attempt, to use xconfig to remember the video configuration, and as before I said yes. Now there was, as always, no offer to register with DKMS. I am left wondering: if I had said no to the xconfig question, would I have been prompted to register with DKMS?? Nonetheless this time the install worked and I was able to reboot OK. Strangely though the reboot command no longer works as it keeps telling me I need to be root, which was not previously the case. So a three finger salute it is then.

. . System reboots and runs with the new drivers but card is getting hot, so I try to manually run the fan control script which crashes and burns. OK, close everything, run coolbits and reboot. Fan control still crashes and burns. OK plan B, now running the xserver interface to run the fans.

. . And I have no intention of trying to update to 7.8.2. even if it was remotely possible.

. . Right now I would like to install 3s-65 and confirm it works with the 1050ti before too many other things change. Then that being successful I will swap out the cards or do you think I should swap out the cards first?

Stephen

:( ??
ID: 1895783

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1895784 - Posted: 17 Oct 2017, 2:22:03 UTC

Stephen, two steps forward and one step back for you seems the norm. Keep the faith!!
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1895784

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895787 - Posted: 17 Oct 2017, 2:33:20 UTC - in response to Message 1895784.  

Stephen, two steps forward and one step back for you seems the norm. Keep the faith!!


. . The way I feel it is more like 2 steps forward and 3 steps back :(

Stephen

:(
ID: 1895787

Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1895798 - Posted: 17 Oct 2017, 3:06:52 UTC

I'm going to post this Inconclusive for a couple of reasons. To begin with, it's the first I've seen of zi3xs3, which is apparently Petri's latest version of the Special App. It actually appears to report signals that, except for the very last one, seem to match SoG pretty closely, albeit with Spikes and Autocorrs being in a different order in several places.

It's that different order that I assume resulted in the last reported signal before the overflow being a Spike in SoG and an Autocorr in zi3xs3.

There is also a disagreement about Best Autocorr, but the odd thing here (at least to me) is that while zi3xs3 reports a Best Autocorr that matches one of the reported signals, the Best Autocorr reported by SoG is not found among the 3 reported signals. Hmmm.....

Anyway, one of my Windows hosts is assigned the tiebreaker, which should run with the same r3584 SoG app as the first host.

Workunit 2711811794 (blc05_2bit_guppi_57903_63524_HIP22812_0048.19015.409.17.26.200.vlar)
Task 6094658928 (S=27, A=3, P=0, T=0, G=0, BG=0) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 6095383686 (S=26, A=4, P=0, T=0, G=0, BG=0) x41p_zi3xs3, Cuda 9.00 special
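
To see exactly where the two results diverge, the per-type counts from the two task lines can be compared side by side. A small sketch: `count_of` and `compare_counts` are hypothetical helpers, and the two strings are copied from the task summaries above.

```shell
#!/bin/sh
# Extract one counter (S, A, P, T, G or BG) from a task summary string.
# The leading "(" or space in the pattern keeps G from matching inside BG.
count_of() {
    printf '%s\n' "$2" | sed -n "s/.*[( ]$1=\([0-9]*\).*/\1/p"
}

# Print every counter that differs between the two summaries.
compare_counts() {
    for key in S A P T G BG; do
        a=$(count_of "$key" "$1")
        b=$(count_of "$key" "$2")
        [ "$a" = "$b" ] || echo "$key differs: SoG=$a special=$b"
    done
}

sog='(S=27, A=3, P=0, T=0, G=0, BG=0)'
special='(S=26, A=4, P=0, T=0, G=0, BG=0)'

compare_counts "$sog" "$special"
```

Only the Spike and Autocorr counts differ, and both results total 30 signals, which is consistent with a single signal being classified differently just before the overflow.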
ID: 1895798

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1895800 - Posted: 17 Oct 2017, 3:14:14 UTC - in response to Message 1895783.  

When I went through the procedure silly me used the command "sudo apt-get remove --purge nvidia" to which the response was nothing to remove ... I should have twigged right then ...
So, it never occurred to you that you might be using the Wrong cmdline? The line is in this thread in numerous locations, and also at Ask Ubuntu;
sudo apt-get remove --purge nvidia*
See that *? It has to be there. You don't mention running autoremove either. The Repository driver will leave items behind that Need to be removed before installing the driver from nVidia.
sudo apt-get autoremove
I'd go back and do it again, correctly this time. Once you have your xorg.conf configured I wouldn't let an installer touch it. You do have a copy stashed somewhere? I have a few copies, one xorg.conf in Documents easy to paste back if needed. You have Always needed to run sudo reboot after installing the driver.
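
As for why the bare name found nothing to remove: `nvidia` only matches a package called exactly that, while the installed package was named `nvidia-375`. The sketch below uses the shell's own `case` pattern matching to illustrate the difference. `matches` is a hypothetical helper, and apt-get's pattern handling is not identical to the shell's, so treat this as an illustration only.

```shell
#!/bin/sh
# Hypothetical helper: does shell-style PATTERN ($1) match package NAME ($2)?
matches() {
    case "$2" in
        $1) echo yes ;;   # unquoted on purpose so $1 is treated as a pattern
        *)  echo no ;;
    esac
}

matches 'nvidia'  'nvidia-375'   # prints: no
matches 'nvidia*' 'nvidia-375'   # prints: yes
```

On the real command line it is safer to quote the pattern, as in 'nvidia*', so the shell cannot expand it against files in the current directory before apt-get sees it.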

The App I have is configured to run on a cc 3.5 GPU, I already know how the normal App runs on others. You will get Many Inconclusive Overflows, and display at least One Invalid Overflow constantly on the other machines. That's why I'm only interested in seeing how this App runs on 3.5 Keplers.
ID: 1895800

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895803 - Posted: 17 Oct 2017, 3:31:50 UTC - in response to Message 1895800.  

When I went through the procedure silly me used the command "sudo apt-get remove --purge nvidia" to which the response was nothing to remove ... I should have twigged right then ...
So, it never occurred to you that you might be using the Wrong cmdline? The line is in this thread in numerous locations, and also at Ask Ubuntu;
sudo apt-get remove --purge nvidia*
See that *? It has to be there. You don't mention running autoremove either. The Repository driver will leave items behind that Need to be removed before installing the driver from nVidia.
sudo apt-get autoremove


. . It did eventually. I do have the '*' in my notes but mistook it for a reference to a footnote, I have now amended my notes. And yes I always run autoremove, that step IS in my notes. I rely heavily on them in lieu of actually having any memory.

I'd go back and do it again, correctly this time. Once you have your xorg.conf configured I wouldn't let an installer touch it. You do have a copy stashed somewhere? I have a few copies, one xorg.conf in Documents easy to paste back if needed. You have Always needed to run sudo reboot after installing the driver.


. . Really? I need to do it again? :( Does that mean saying no to using xconfig and hoping the offer of registering with DKMS will happen? Also now you have me worried, I have sudo reboot in my notes but did I absentmindedly omit the sudo??

The App I have is configured to run on a cc 3.5 GPU, I already know how the normal App runs on others. You will get Many Inconclusive Overflows, and display at least One Invalid Overflow constantly on the other machines. That's why I'm only interested in seeing how this App runs on 3.5 Keplers.


. . OK so I will change the card now then?

Stephen

??
ID: 1895803

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1895821 - Posted: 17 Oct 2017, 6:25:34 UTC - in response to Message 1895803.  
Last modified: 17 Oct 2017, 6:25:46 UTC

Check your PMs.
ID: 1895821

Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1895826 - Posted: 17 Oct 2017, 8:36:52 UTC - in response to Message 1895821.  

. . Hey TBar,

. . Results so far. This is with x41p_zi3v.

. . 1 x Arecibo (probably low AR) @ 28.1 mins
. . 9 x " @ 12.1 to 15.7 mins . . - . .so far 5 have validated AOK
. . 1 x probable halfling @ 6.8 mins.

. . Not bad for a humble little GPU with only 2 CUs :) And I am running it with BS on so I can crunch on the second CPU core.

. . But it does have 384 cuda cores, 2GB ram and a 1GHz clock :)

Stephen
ID: 1895826

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.