Linux CUDA 'Special' App finally available, featuring Low CPU use

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1884569 - Posted: 17 Aug 2017, 21:54:47 UTC - in response to Message 1884559.  
Last modified: 17 Aug 2017, 21:58:18 UTC

....24 hours later....the jumping tasks page is there on Mageia 6!
I don't understand why it worked fine for a whole day and now the bug is there!
Anyway, this is not a serious bug for me; I usually display only running tasks.

Make sure the driver is set to "nvidia" in /etc/X11/xorg.conf.
I agree Mageia is not as user friendly as Ubuntu...but it's a French distribution ;)
Yeah, I had tried to change xorg.conf first, but gksu nautilus doesn't work in Mageia. Then I tried the compiz change. I was able to edit xorg.conf while booted into Ubuntu, but now Mageia is stuck in a reboot loop: "The system has to be rebooted due to a display driver change". Apparently something else needs to be changed, and it's kind of difficult having to make edits while booted into Ubuntu. Since you're now getting the Page Jump with BOINC 7.8, I suppose I really don't need to get Mageia running, though it would be nice to get it running. I've tried both using my Ubuntu xorg.conf and editing the one in Mageia; it's still stuck in the reboot loop.
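For checking the driver setting without a root file manager, here is a minimal sketch. It assumes a conventional xorg.conf layout with Section "Device" ... EndSection blocks. The helper name device_drivers and the sample text are made up for illustration; the function only reads text and reports each Driver value it finds.

```python
import re

def device_drivers(xorg_conf_text):
    """Return the Driver value of every Section "Device" block, in file order."""
    drivers = []
    in_device = False
    for raw in xorg_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if re.match(r'Section\s+"Device"', line, re.IGNORECASE):
            in_device = True
        elif re.match(r'EndSection\b', line, re.IGNORECASE):
            in_device = False
        elif in_device:
            m = re.match(r'Driver\s+"([^"]*)"', line, re.IGNORECASE)
            if m:
                drivers.append(m.group(1))
    return drivers

# Hypothetical sample, not taken from any real machine in this thread.
SAMPLE = '''
Section "Device"
    Identifier "Device0"
    Driver     "nouveau"   # open-source driver
EndSection
'''

print(device_drivers(SAMPLE))  # anything other than ['nvidia'] needs editing
```

To check a real system, pass in the contents of /etc/X11/xorg.conf, which is normally readable without root.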
ID: 1884569
W3Perl Project Donor
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1884680 - Posted: 18 Aug 2017, 9:53:06 UTC - in response to Message 1884569.  

....24 hours later....the jumping tasks page is there on Mageia 6!
I don't understand why it worked fine for a whole day and now the bug is there!
Anyway, this is not a serious bug for me; I usually display only running tasks.

Make sure the driver is set to "nvidia" in /etc/X11/xorg.conf.
I agree Mageia is not as user friendly as Ubuntu...but it's a French distribution ;)
Yeah, I had tried to change xorg.conf first, but gksu nautilus doesn't work in Mageia. Then I tried the compiz change. I was able to edit xorg.conf while booted into Ubuntu, but now Mageia is stuck in a reboot loop: "The system has to be rebooted due to a display driver change". Apparently something else needs to be changed, and it's kind of difficult having to make edits while booted into Ubuntu. Since you're now getting the Page Jump with BOINC 7.8, I suppose I really don't need to get Mageia running, though it would be nice to get it running. I've tried both using my Ubuntu xorg.conf and editing the one in Mageia; it's still stuck in the reboot loop.


Try installing the driver from NVIDIA.
Except for this little bug, 8.0 works fine for me (on Mageia and Ubuntu 16.04).
I'm just missing the 'Allows caching of up to 3k tasks' feature! Do you have a patch?
ID: 1884680
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1884688 - Posted: 18 Aug 2017, 11:01:46 UTC - in response to Message 1884680.  
Last modified: 18 Aug 2017, 11:02:55 UTC

Except for this little bug, 8.0 works fine for me (on Mageia and Ubuntu 16.04).
I'm just missing the 'Allows caching of up to 3k tasks' feature! Do you have a patch?
From the latest post at the BOINC board, the latest version 7.8.1, run as the Repository version, has the same Bug on Debian as all the others newer than 7.2.47. Since the Bug has been infesting BOINC for around 3 Years, I don't expect a fix anytime soon. I don't plan on making any changes to the newer version, as my 7.8 may even make it to Beta. If it were me, I'd drop kick the current "Definitely Unstable" for 3 Years version 7.4.22 and place my 7.8 in the "Development version" slot, as 7.8 is much better seeing as how the Counters count and the Active tasks are actually active. My 7.8 also has mostly Static libraries, meaning it only has One outstanding Dependency on the tested systems. At some point someone has to give 7.4.22 the boot; it should have been done a couple of years ago. I don't see 7.2.47 going anywhere except in the All In One Special package, so if you really want its feature, you're going to have to use 7.2.47.

I think I'll just give up on Mageia. It's so different from what I'm used to that it's not worth running, seeing as how the Special App works in all versions of Linux.
ID: 1884688
W3Perl Project Donor
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1884710 - Posted: 18 Aug 2017, 12:52:34 UTC - in response to Message 1884688.  

Except for this little bug, 8.0 works fine for me (on Mageia and Ubuntu 16.04).
I'm just missing the 'Allows caching of up to 3k tasks' feature! Do you have a patch?
From the latest post at the BOINC board, the latest version 7.8.1, run as the Repository version, has the same Bug on Debian as all the others newer than 7.2.47. Since the Bug has been infesting BOINC for around 3 Years, I don't expect a fix anytime soon.


Maybe they are just not aware of this bug? If you hadn't told me about it, I would never have seen it (especially as it happens only after 24 hours for me!).
Did you send a bug report?


I don't plan on making any changes to the newer version, as my 7.8 may even make it to Beta. If it were me, I'd drop kick the current "Definitely Unstable" for 3 Years version 7.4.22 and place my 7.8 in the "Development version" slot, as 7.8 is much better seeing as how the Counters count and the Active tasks are actually active. My 7.8 also has mostly Static libraries, meaning it only has One outstanding Dependency on the tested systems. At some point someone has to give 7.4.22 the boot; it should have been done a couple of years ago. I don't see 7.2.47 going anywhere except in the All In One Special package, so if you really want its feature, you're going to have to use 7.2.47.


OK, I understand. Maybe you can send your binary to the BOINC team so they can at least remove the 'old' 7.4.22 dev package?


I think I'll just give up on Mageia. It's so different from what I'm used to that it's not worth running, seeing as how the Special App works in all versions of Linux.


Yes...if you need some tests on Mageia, tell me ;)
ID: 1884710
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1884718 - Posted: 18 Aug 2017, 13:50:19 UTC - in response to Message 1884710.  

Maybe they are just not aware of this bug? If you hadn't told me about it, I would never have seen it (especially as it happens only after 24 hours for me!).
Did you send a bug report?

You can start here and work through some of the other pages, https://www.mail-archive.com/boinc_dev@ssl.berkeley.edu/msg07978.html
I'd say they are pretty aware of the problems. In Ubuntu, the Tasks Page will jump Multiple times per hour. I'd say if they aren't aware of that, then it's pretty hopeless.
ID: 1884718
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1884719 - Posted: 18 Aug 2017, 14:02:25 UTC - in response to Message 1884718.  

I have never seen the page jump to the bottom, but then I keep mine sorted by remaining time at the top - Show All, smallest number on top.
I do however see the window scroll to the left sometimes.
This is on both Ubuntu 14 and Mint 18.2.
ID: 1884719
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1884721 - Posted: 18 Aug 2017, 14:07:16 UTC - in response to Message 1884719.  
Last modified: 18 Aug 2017, 14:30:13 UTC

Which versions of BOINC are you using in each OS? The Jumping problem started around 7.4.22. You can read what other problems were being talked about in my previous link.

BTW, if you look back at the Posts you will notice I've repeated numerous times that BOINC 7.2.42 is the Last one that Didn't have the Jumping problem. Which is Why I built 7.2.47...with improvements.
ID: 1884721
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1884724 - Posted: 18 Aug 2017, 14:27:13 UTC - in response to Message 1884721.  
Last modified: 18 Aug 2017, 14:34:19 UTC

7.2.42 in Ubuntu
7.6.33 in Mint 18
Both are from repository

Possibly something to do with the column being graphical (progress) that is making it do it on you?

EDIT: I wonder ... I usually am not displaying the status bar (scrolled right), so I would likely miss the bugs (I think you mentioned) with the progress bar jumping. And that may cause the listing flip???
ID: 1884724
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1884725 - Posted: 18 Aug 2017, 14:34:33 UTC - in response to Message 1884724.  
Last modified: 18 Aug 2017, 14:38:00 UTC

I'd say it's because you haven't read the thread. You are using 7.2.42 Which DOESN'T HAVE THE PROBLEM.
You Appear to be running CPUs with 7.6.31 on a new system. Try running 7.6.31 on a GPU system where updates happen every 5 minutes. See how that works for you.
ID: 1884725
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1884726 - Posted: 18 Aug 2017, 14:57:02 UTC - in response to Message 1884725.  
Last modified: 18 Aug 2017, 14:58:09 UTC

I know 7.2.42 was not one you claim to be buggy. I meant that I don't see any difference in behaviour between the two - just the left scroll thing.

I had a 750Ti in the i7-960 for a couple of days. And a different install running on 24 threads of X5680 and 2x750Ti. Didn't see anything different.
ID: 1884726
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1884729 - Posted: 18 Aug 2017, 15:13:38 UTC - in response to Message 1884726.  
Last modified: 18 Aug 2017, 15:22:50 UTC

You are the First person to say you don't have the jumping problem with 7.6; a few others have said they Do.
I have tried 7.6.33 on three different GPU systems and they all have the Jumping problem, just as others have confirmed.
A couple of others have also confirmed the problem with the latest builds in the BOINC Forum here,
I've been trying 7.8.1 on Debian thanks to Gianfranco.
The bad news is it has the jumping tasks bug and the event log loses the time format.

Interesting you don't have the problem that has been confirmed on the BOINC Developer's Forum.
ID: 1884729
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1884798 - Posted: 18 Aug 2017, 20:27:42 UTC

The Downloads page has a similar bug.

Sort rows by your preferred column, and BANG, the next update trashes the sort order.

I do not care, but I know it is there.

The BUG.

--
petri33
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1884798
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1885254 - Posted: 21 Aug 2017, 13:06:34 UTC

Hi Guys,

I switched over to the new X41p_Zi3v Cuda app several days ago. I seem to be getting more invalids than I am used to. Most of them appear to be -9 overflows, but I don't know if it is just luck of the draw or something not right. Here is my Invalid Page, if someone could take a look and see if there is something I need to correct, I would appreciate it.

Thanks.
Bruce
ID: 1885254
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1885337 - Posted: 21 Aug 2017, 20:05:46 UTC - in response to Message 1885254.  

Try pfb = 32, or try taking it out. Those are definitely high numbers.
What is your command line?
Clocking?
ID: 1885337
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1885342 - Posted: 21 Aug 2017, 20:27:44 UTC - in response to Message 1885337.  

Try pfb = 32, or try taking it out. Those are definitely high numbers.
What is your command line?
Clocking?


I don't overclock my cards, it's just stock timings.
I tried pfb 32 and it seems to be worse. The command line has only the pfb setting.
I'll try backing it off to pfb 8, or just take it out like you suggest.
Bruce
ID: 1885342
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1885347 - Posted: 21 Aug 2017, 20:51:08 UTC - in response to Message 1885342.  

Try pfb = 32, or try taking it out. Those are definitely high numbers.
What is your command line?
Clocking?


I don't overclock my cards, it's just stock timings.
I tried pfb 32 and it seems to be worse. The command line has only the pfb setting.
I'll try backing it off to pfb 8, or just take it out like you suggest.


Set GPU fan to 100% and see if that helps.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1885347
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1885351 - Posted: 21 Aug 2017, 20:57:16 UTC
Last modified: 21 Aug 2017, 21:17:58 UTC

The CUDA 6.5 Special App is for the older Kepler CC 3.5 GPUs that might not work well with CUDA 8.0.
Device 1: GeForce GTX TITAN Z, 6081 MiB, regsPerBlock 65536 computeCap 3.5

I'd try CUDA 6.5.
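
For anyone checking several hosts, a small sketch of pulling the compute capability out of a device line like the one quoted above. It assumes the line keeps exactly that layout; parse_device_line is a hypothetical helper, not something shipped with the app.

```python
import re

# Pattern written against the single device line quoted in this post.
DEVICE_LINE = re.compile(
    r"Device (\d+): (.+?), (\d+) MiB, regsPerBlock (\d+) computeCap (\d+\.\d+)"
)

def parse_device_line(line):
    """Return the fields of one device line as a dict, or None if it does not match."""
    m = DEVICE_LINE.search(line)
    if m is None:
        return None
    return {
        "index": int(m.group(1)),
        "name": m.group(2),
        "memory_mib": int(m.group(3)),
        "regs_per_block": int(m.group(4)),
        "compute_cap": m.group(5),  # kept as text so "3.5" stays exact
    }

info = parse_device_line(
    "Device 1: GeForce GTX TITAN Z, 6081 MiB, regsPerBlock 65536 computeCap 3.5"
)
print(info["name"], info["compute_cap"])
```

A computeCap of 3.5 is the Kepler case mentioned above, where the CUDA 6.5 build is the one to try first.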

BTW, I finally received a reply from nVidia about the Mac Driver Bug report;
*- Status changed from "Open - in progress" to "Closed - not an NV bug"*

That took a little over 3 months to receive that response.
In the Benchmarks, the latest NV Mac CUDA Pascal driver 8.0.90 is about Twice as slow as the Last Maxwell CUDA Driver 8.0.71 running zi3v.
ID: 1885351
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1885365 - Posted: 21 Aug 2017, 22:15:28 UTC

@Petri
I always run the fans at 100%.

@TBar
I changed over to the CUDA 6.5 app. I thought my cards were capable of CUDA 8.0; guess not.
Hopefully this will get rid of all the invalids. Updated the video driver to 375.82.
Messed up too many GPU tasks, may have to wait until tomorrow to get new ones.
Bruce
ID: 1885365
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1885372 - Posted: 21 Aug 2017, 22:46:21 UTC - in response to Message 1885365.  
Last modified: 21 Aug 2017, 23:43:31 UTC

I think the general rule is you should use the CUDA version that was current when your GPU was released. In most cases you can use a higher version of CUDA; in other cases you can't. On the Mac side, all the Apps that worked fine up to CUDA 8.0.71 suddenly don't work correctly with CUDA 8.0.81 and above. That's alright, nVidia just told us earlier today it's Not an NV Bug, so I guess everything is OK.

It seems you have a number of GPU tasks, but they are just Ghosts. That happens when you use a different Version number or Plan Class in your app_info.xml when you have Existing tasks assigned to one particular set of numbers. If you have Existing tasks, you must keep the numbers in your app_info.xml the same as the ones in your client_state.xml. You can recover Ghosts, 20 at a time, using a couple of different procedures. When you get a large number of Ghosts, it becomes tedious.
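
A rough sketch of checking for that mismatch before restarting BOINC. It assumes both files parse as plain XML and that each result in client_state.xml carries its own version_num and plan_class elements; the sample values (800, cuda65, cuda80) and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

def version_keys(root, tag):
    """Collect (version_num, plan_class) pairs from every <tag> element under root."""
    keys = set()
    for el in root.iter(tag):
        num = (el.findtext("version_num") or "").strip()
        plan = (el.findtext("plan_class") or "").strip()
        keys.add((num, plan))
    return keys

def unmatched_task_keys(app_info_xml, client_state_xml):
    """Pairs used by existing tasks that no app_version in app_info.xml provides."""
    offered = version_keys(ET.fromstring(app_info_xml), "app_version")
    needed = version_keys(ET.fromstring(client_state_xml), "result")
    return sorted(needed - offered)

# Hypothetical, heavily trimmed stand-ins for the two files.
APP_INFO = """<app_info>
  <app_version>
    <version_num>800</version_num>
    <plan_class>cuda65</plan_class>
  </app_version>
</app_info>"""

CLIENT_STATE = """<client_state>
  <result><name>task_a</name><version_num>800</version_num><plan_class>cuda80</plan_class></result>
  <result><name>task_b</name><version_num>800</version_num><plan_class>cuda65</plan_class></result>
</client_state>"""

print(unmatched_task_keys(APP_INFO, CLIENT_STATE))
```

Any pair it prints belongs to existing tasks that the edited app_info.xml no longer covers, which is the situation described above.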

EDIT = BTW, if for some reason the CUDA Driver insists on finding the CUDA 6.5 Libraries instead of using the CUDA 6.0 Libraries listed in the app_info.xml, make links to the CUDA 6.0 Libraries and rename the Links libcudart.so.6.5 & libcufft.so.6.5. It seems on some systems the Libraries listed in the app_info.xml are being ignored and it demands to find the 6.5 Libraries. Making links, or duplicating the Libraries, and renaming them to 6.5 should fix that.
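
The link step can be sketched as below, assuming the 6.0 libraries sit together in one directory (typically the project folder) and that the filenames follow the libcudart.so.6.0 pattern named above. link_cuda_libs is a made-up helper; the demonstration runs in a throwaway directory with empty stand-in files, not against a real BOINC install.

```python
import os
import tempfile

def link_cuda_libs(lib_dir, have="6.0", want="6.5",
                   stems=("libcudart.so", "libcufft.so")):
    """Create <stem>.<want> symlinks pointing at <stem>.<have>; return the links made."""
    created = []
    for stem in stems:
        target = f"{stem}.{have}"
        link = os.path.join(lib_dir, f"{stem}.{want}")
        if not os.path.exists(os.path.join(lib_dir, target)):
            continue  # no library of the version we have, nothing to link to
        if os.path.lexists(link):
            continue  # leave any existing file or link alone
        os.symlink(target, link)  # relative target, so the link survives moving the directory
        created.append(link)
    return created

# Demonstration with empty stand-in files in a temporary directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("libcudart.so.6.0", "libcufft.so.6.0"):
        open(os.path.join(demo_dir, name), "w").close()
    for path in link_cuda_libs(demo_dir):
        print(os.path.basename(path), "->", os.readlink(path))
```

The function skips names that already exist, so running it twice does no harm.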
ID: 1885372
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1885407 - Posted: 22 Aug 2017, 3:28:51 UTC - in response to Message 1885372.  

Hi TBar

I don't think I'm going to worry about the ghosts until I get the invalid problem fixed. Quite a few of them should be showing as aborts and compute errors, don't know why they turned ghost instead.

The Cuda 6.5 app runs worse than the Cuda 8.0 app. All compute errors.
Tried rolling video driver back to 375.66.
I tried making links and renaming them, didn't help.
Tried downloading fresh copies of libcudart.so.6.0 and libcufft.so.6.0, also downloaded a second copy of each and renamed them to 6.5.0.
Nothing seems to help.

Right now cuda is completely hosed here.

Even tried changing app_info to the renamed files.

Anything else I could try???

Thanks
Bruce
ID: 1885407


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.