Linux CUDA 'Special' App finally available, featuring Low CPU use
Author | Message |
---|---|
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Are we still in pre-production testing for the CUDA 9.0 apps? Or are they usable in production now? . . I think it is still Caveat Emptor ........ Stephen :) |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
It depends on how you decipher this post, https://setiathome.berkeley.edu/forum_thread.php?id=80636&postid=1895161#1895161 To me, that sounds like another change, and I'm still testing. The clincher is, there aren't any CUDA 9 Links at C.A. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
My confusion stems from the stderr.txt output I see in tasks processed by xs2 saying application by Petri33 and released to the public by TBar. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
"Are we still in pre-production testing for the CUDA 9.0 apps? Or are they usable in production now?" The impression I got from Petri's original post was that any "pre-production testing" was something he expected each individual user to simply perform with offline runs on their own configurations. Then, it would be a judgment call for each user as to when, or whether, the switch-over to production could commence. So, basically, you have to make your own call as to how much offline testing is sufficient before going live. Not an ideal approach, but it is what it is. :^) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . OK TBar, Things to do: . . 1) Upgrade video drivers to 384.xx. Q - are they in the Ubuntu repository or do I go to nvidia? . . 2) Upgrade BOINC to 7.8.2 Q - since I haven't had to upgrade BOINC in the 6 months I have been using Linux is there anything I particularly need to DO/NOT Do/Be aware of? {note- I should probably add CA to my fave's list} . . 3) Suspend all tasks and swap out GPU . . 4) Resume one task and see if it runs using 3v. If not suspend (clear results files) then go back to earlier version of Special sauce, or maybe to a CUDA65 version. . . 5) Once I have WUs crunching successfully with the GT730 then run a few through and gather baseline data. . . 6) Install the test version (CUDA90?) and compare results. . . 7) Tell TBar what the results are.! . . Have I missed anything ??? . . If nothing else it will be a good opportunity to give the rig a spring clean :) Stephen :) |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
My confusion stems from the stderr.txt output I see in tasks processed by xs2 saying application by Petri33 and released to the public by TBar. If the indicator is "Why am I seeing a huge amount of resends now that the new app has been 'released'?" I would say NO, it is not ready yet. I haven't looked at why I am seeing so many resends, but I have a feeling I know why ... and they are likely validating against each other ... |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . UPDATE: . . My confidence in Murphy's Law is reinforced. Things to do: . . 1) Upgrade video drivers to 384.xx. Q - are they in the Ubuntu repository or do I go to nvidia? So now, since this version of the drivers is not in the repository, I will have to re-install the video drivers every time Linux does an update ... :( . . 2) Upgrade BOINC to 7.8.2 Q - since I haven't had to upgrade BOINC in the 6 months I have been using Linux is there anything I particularly need to DO/NOT Do/Be aware of? {note- I should probably add CA to my fave's list} OK, this is going to be a disaster I feel because this rig is currently running the repository version of BOINC and I do not know how to manually install over the top of that. So I have to ask, is this part absolutely necessary? . . 3) Suspend all tasks and swap out GPU . . 4) Resume one task and see if it runs using 3v. If not suspend (clear results files) then go back to earlier version of Special sauce, or maybe to a CUDA65 version. . . 5) Once I have WUs crunching successfully with the GT730 then run a few through and gather baseline data. . . 6) Install the test version (CUDA90?) and compare results. . . 7) Tell TBar what the results are.! . . Have I missed anything ??? . . If nothing else it will be a good opportunity to give the rig a spring clean :) Stephen :) |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
. . UPDATE: I just added the graphics driver ppa and updated through the Package Manager. Pretty easy. I don't know, but I wouldn't think Linux downgrades a driver when it updates. At least it always asks whether you want a package uninstalled or installed during an update. And yes, you would need the 384.90 level Nvidia drivers to support the new Petri/Tbar CUDA 9.0 app. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
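The 384.90 requirement mentioned above can be checked before swapping anything. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and GNU `sort -V` is available; the function names are made up for illustration.

```shell
# Minimum driver level quoted in the thread for the CUDA 9.0 app.
MIN_DRIVER="384.90"

# version_at_least HAVE WANT: succeeds when HAVE >= WANT (relies on GNU sort -V).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Ask the driver for its version; prints nothing if nvidia-smi is unavailable.
installed_driver() {
    command -v nvidia-smi >/dev/null 2>&1 &&
        nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n1
}

have="$(installed_driver || true)"
if [ -z "$have" ]; then
    echo "no NVIDIA driver detected"
elif version_at_least "$have" "$MIN_DRIVER"; then
    echo "driver $have is new enough for CUDA 9.0"
else
    echo "driver $have is older than $MIN_DRIVER"
fi
```

Version sort is used rather than a plain string comparison so that a release such as 384.111 is treated as newer than 384.90.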
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
1) Upgrade video drivers to 384.xx. Q - are they in the Ubuntu repository or do I go to nvidia? So now, since this version of the drivers is not in the repository, I will have to re-install the video drivers every time Linux does an update ... I don't have to reinstall the driver. Make sure you have DKMS installed (sudo apt-get install dkms). I did give you a link to the driver. Download the driver, move it to your home folder, and set the execute bit. Drop into the console, log in, sudo stop lightdm, remove the repository driver and run autoremove. When you install the nVidia driver choose to register the kernel module and it will be automatically reinstalled during each update. Works for me. If you can't update the version of BOINC, then Don't. I still don't see a Kepler card in that machine yet. Since I last posted, I have done the following, 1) Replaced an NV card with an ATI card so I could boot to El Capitan 2) Booted to El Capitan where XCode works with Petri's Code 3) Installed the CUDA 9 ToolKit 4) Tried compiling different versions of boinc-master until I found one that worked. Apparently this Screen Saver Fix Broke the latest boinc-master in OSX 5) Tried for hours to get the Static Libraries to work in OSX...it doesn't work 6) Settled for the last CUDA 9 App that doesn't have Static Libraries, zi3x, which apparently is not any better than zi3v. 7) Swapped cards again, booted back to the OS that supports Pascal, and tested the new zi3x App. ...and I still don't see a Kepler card in Stephen's machine. But hey, I have a new CUDA 9 Mac App that seems to be working as usual. |
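The driver procedure described above can be laid out as an ordered plan. This is a dry-run sketch only: it prints each command and runs none of them. The driver filename is a hypothetical placeholder (use whatever file was actually downloaded), and `sudo stop lightdm` is Upstart syntax as written in the post; on a systemd-based release the equivalent would be `sudo systemctl stop lightdm`.

```shell
# Dry-run outline: every command is printed, nothing is executed.
# DRIVER is a hypothetical filename, not one given in the thread.
DRIVER="${DRIVER:-$HOME/NVIDIA-Linux-x86_64-384.90.run}"

step() { printf '%s\n' "$*"; }   # print a command instead of running it

driver_install_plan() {
    step "sudo apt-get install dkms"              # lets the module rebuild on kernel updates
    step "chmod +x $DRIVER"                       # set the execute bit
    step "sudo stop lightdm"                      # from a text console, as in the post
    step "sudo apt-get remove --purge 'nvidia*'"  # quoted so the shell leaves the * alone
    step "sudo apt-get autoremove"                # clear what the repository driver left behind
    step "sudo $DRIVER"                           # answer yes when offered DKMS registration
    step "sudo reboot"
}

driver_install_plan
```

Printing the plan first makes it easy to compare against personal notes before anything is removed from a working system.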
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
1) Upgrade video drivers to 384.xx. Q - are they in the Ubuntu repository or do I go to nvidia? So now, since this version of the drivers is not in the repository, I will have to re-install the video drivers every time Linux does an update ...I don't have to reinstall the driver. Make sure you have DKMS installed, sudo apt-get install dkms . . Yep, because unlike you I am scared of Linux because it keeps biting me. I tried upgrading the video drivers on the other rig because having BOINC under the home directory I felt it would be more stable. But no, it wigged out during the process and I had to "login", so I did as myself, turns out I needed to login as root, so that went down the drain. I am in the process of trying to get that unit working again. And since the update to release 97 that trashed the loader forcing me to roll it back to release 96 now it keeps losing the loader and then making the flashdrive disappear. I have to keep moving it from port to port to get to see it again so I can roll it back whenever I have to boot. AAAhhhhhhrrfghhgh! Linux .... :( Stephen |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . May I ask, what does ppa mean? Stephen ?? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
It stands for Personal Package Archive, an Ubuntu software repository. I found the information about the official graphics ppa here: ubuntu-official-ppa-graphics Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
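For readers who have not used a PPA before, adding one and installing from it usually comes down to three commands. The sketch below only prints them. The PPA name `ppa:graphics-drivers/ppa` and the package name `nvidia-384` are assumptions, since the thread gives neither; check the linked page for the current names.

```shell
# Dry-run sketch: prints the commands for adding a PPA, runs none of them.
# PPA and PACKAGE are assumptions, not values given in the thread.
PPA="${PPA:-ppa:graphics-drivers/ppa}"
PACKAGE="${PACKAGE:-nvidia-384}"

ppa_plan() {
    printf '%s\n' \
        "sudo add-apt-repository $PPA" \
        "sudo apt-get update" \
        "sudo apt-get install $PACKAGE"
}

ppa_plan
```

A driver installed this way stays under the package manager's control, which is the route Keith describes taking.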
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
I don't have to reinstall the driver. Make sure you have DKMS installed, sudo apt-get install dkms . . That was like having every tooth pulled without anaesthetic ... . . OK ... here's the thing ... . . I tried to install DKMS as you said, it was already there. But ... . . When I went through the procedure silly me used the command "sudo apt-get remove --purge nvidia" to which the response was nothing to remove ... I should have twigged right then ... . . When I ran the install THIS TIME it offered the option of registering with DKMS. None of the previous driver installs had done that. So I selected yes and it began compiling the DKMS kernel module. But that failed and said I needed other things such as pkg-config. When I tried to install that ... it was already there :( . . I couldn't boot because I was now in limbo and doing the login screen hop! . . I tried re-installing the repository drivers but it kept saying they were already the latest, second time around it twigged. So I reran "sudo apt-get remove --purge nvidia-375" and what do you know, that did something. . . I re-ran the install and this time it offered, what it had always offered on each successful past attempt, to use xconfig to remember the video configuration and as before I said yes. Now there was as always no offer to register with DKMS. I am left wondering if I had said no to the xconfig question would I have been prompted to register with DKMS?? Nonetheless this time the install worked and I was able to reboot OK. Strangely though the reboot command no longer works as it keeps telling me I need to be root, which was not previously the case. So a three finger salute it is then. . . System reboots and runs with the new drivers but card is getting hot, so I try to manually run the fan control script which crashes and burns. OK, close everything, run coolbits and reboot. Fan control still crashes and burns. OK plan B, now running the xserver interface to run the fans. . . And I have no intention of trying to update to 7.8.2, even if it was remotely possible. . . Right now I would like to install 3s-65 and confirm it works with the 1050ti before too many other things change. Then that being successful I will swap out the cards or do you think I should swap out the cards first? Stephen :( ?? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Stephen, two steps forward and one step back for you seems the norm. Keep the faith!! Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Stephen, two steps forward and one step back for you seems the norm. Keep the faith!! . . The way I feel it is more like 2 steps forward and 3 steps back :( Stephen :( |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
I'm going to post this Inconclusive for a couple of reasons. To begin with, it's the first I've seen of zi3xs3, which is apparently Petri's latest version of the Special App. It actually appears to report signals that, except for the very last one, seem to match SoG pretty closely, albeit with Spikes and Autocorrs being in a different order in several places. It's that different order that I assume resulted in the last reported signal before the overflow being a Spike in SoG and an Autocorr in zi3xs3. There is also a disagreement about Best Autocorr, but the odd thing here (at least to me) is that while zi3xs3 reports a Best Autocorr that matches one of the reported signals, the Best Autocorr reported by SoG is not found among the 3 reported signals. Hmmm..... Anyway, one of my Windows hosts is assigned the tiebreaker, which should run with the same r3584 SoG app as the first host. Workunit 2711811794 (blc05_2bit_guppi_57903_63524_HIP22812_0048.19015.409.17.26.200.vlar) Task 6094658928 (S=27, A=3, P=0, T=0, G=0, BG=0) v8.22 (opencl_nvidia_SoG) windows_intelx86 Task 6095383686 (S=26, A=4, P=0, T=0, G=0, BG=0) x41p_zi3xs3, Cuda 9.00 special |
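The two count lines above can be compared mechanically. This small sketch sums the numeric fields of each line exactly as reported, without assuming what each letter stands for beyond the Spikes and Autocorrs named in the post; `sum_counts` is a made-up helper name.

```shell
# sum_counts "S=27, A=3, ...": add up every number in a reported count line.
sum_counts() (
    total=0
    for n in $(printf '%s\n' "$1" | grep -oE '[0-9]+'); do
        total=$((total + n))
    done
    echo "$total"
)

sog="S=27, A=3, P=0, T=0, G=0, BG=0"       # Task 6094658928, SoG
special="S=26, A=4, P=0, T=0, G=0, BG=0"   # Task 6095383686, zi3xs3

echo "SoG total: $(sum_counts "$sog"), zi3xs3 total: $(sum_counts "$special")"
```

Both lines add up to the same total, so the two results differ only in how one signal was classified, which fits the reordering explanation given in the post.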
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
When I went through the procedure silly me used the command "sudo apt-get remove --purge nvidia" to which the response was nothing to remove ... I should have twigged right then ... So, it never occurred to you that you might be using the Wrong cmdline? The line is in this thread in numerous locations, and also at Ask Ubuntu: sudo apt-get remove --purge nvidia* See that *? It has to be there. You don't mention running autoremove either. The Repository driver will leave items behind that Need to be removed before installing the driver from nVidia: sudo apt-get autoremove I'd go back and do it again, correctly this time. Once you have your xorg.conf configured I wouldn't let an installer touch it. You do have a copy stashed somewhere? I have a few copies, one xorg.conf in Documents easy to paste back if needed. You have Always needed to run sudo reboot after installing the driver. The App I have is configured to run on a cc 3.5 GPU, I already know how the normal App runs on others. You will get Many Inconclusive Overflows, and display at least One Invalid Overflow constantly on the other machines. That's why I'm only interested in seeing how this App runs on 3.5 Keplers. |
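Why the trailing * matters can be shown with ordinary shell pattern matching. This is an illustration only: apt-get applies its own matching rules, described in its man page, and the package names other than nvidia-375 are examples rather than names taken from the thread.

```shell
# matches PATTERN NAME: succeeds when NAME matches the shell-style PATTERN.
matches() {
    # $1 is left unquoted on purpose so that it is treated as a pattern.
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

# Illustrative package names; only nvidia-375 is mentioned in the thread.
for pkg in nvidia-375 nvidia-settings nvidia-prime; do
    for pattern in 'nvidia' 'nvidia*'; do
        if matches "$pattern" "$pkg"; then
            verdict="matches"
        else
            verdict="does not match"
        fi
        printf '%-10s %-16s %s\n' "$pattern" "$pkg" "$verdict"
    done
done
```

The bare name `nvidia` matches none of them, which is consistent with the "nothing to remove" response reported earlier. When typing the real command, quoting the pattern as `'nvidia*'` keeps the shell from expanding it against files in the current directory before apt-get sees it.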
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
When I went through the procedure silly me used the command "sudo apt-get remove --purge nvidia" to which the response was nothing to remove ... I should have twigged right then ... So, it never occurred to you that you might be using the Wrong cmdline? The line is in this thread in numerous locations, and also at ASK Ubuntu; . . It did eventually. I do have the '*' in my notes but mistook it for a reference to a footnote, I have now amended my notes. And yes I always run autoremove, that step IS in my notes. I rely heavily on them in lieu of actually having any memory. I'd go back and do it again, correctly this time. Once you have your xorg.conf configured I wouldn't let an installer touch it. You do have a copy stashed somewhere? I have a few copies, one xorg.conf in Documents easy to paste back if needed. You have Always needed to run sudo reboot after installing the driver. . . Really? I need to do it again? :( Does that mean saying no to using xconfig and hoping the offer of registering with DKMS will happen? Also now you have me worried, I have sudo reboot in my notes but did I absentmindedly omit the sudo?? The App I have is configured to run on a cc 3.5 GPU, I already know how the normal App runs on others. You will get Many Inconclusive Overflows, and display at least One Invalid Overflow constantly on the other machines. That's why I'm only interested in seeing how this App runs on 3.5 Keplers. . . OK so I will change the card now then? Stephen ?? |
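The advice about keeping a spare xorg.conf can be turned into a small helper. This is a sketch under assumptions: `backup_xorg` is a made-up name, and the paths in the usage comment (`/etc/X11/xorg.conf` and a Documents folder) are the usual locations rather than ones confirmed for this machine.

```shell
# backup_xorg SRC DEST_DIR: copy SRC into DEST_DIR under a dated name.
backup_xorg() {
    src="$1"
    dest_dir="$2"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" &&
        cp -p "$src" "$dest_dir/xorg.conf.$(date +%Y%m%d)"
}

# Typical call (paths assumed):
#   backup_xorg /etc/X11/xorg.conf "$HOME/Documents"
```

Dating the copy means a later backup does not overwrite the last known-good file, so there is always something to paste back if an installer rewrites the configuration.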
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Check your PMs. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Hey TBar, . . Results so far. This is with x41p_zi3v. . . 1 x Arecibo (probably low AR) @ 28.1 mins . . 9 x " @ 12.1 to 15.7 mins . . - . .so far 5 have validated AOK . . 1 x probable halfling @ 6.8 mins. . . Not bad for a humble little GPU with only 2 CUs :) And I am running it with BS on so I can crunch on the second CPU core. . . But it does have 384 cuda cores, 2GB ram and a 1GHz clock :) Stephen |