Low RAC with GTX 1080

W3Perl
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1810143 - Posted: 18 Aug 2016, 14:26:48 UTC

I have a computer with a GTX 1080 running Fedora 24. I'm lucky, yes ;)
But the RAC is still 16k after a month... the same value as another computer running two GTX 950s. So I'm puzzled: is a GTX 1080 really the same as two GTX 950s?
I'm running the anonymous platform with MBv8_8.04r3306 and setiathome_x41z on both hosts.

The CPU is an i7-6700K: 5 CPUs are crunching and the 3 remaining are feeding the GTX 1080 (4x setiathome_x41z).

Host information :
http://setiathome.berkeley.edu/show_host_detail.php?hostid=8043590

Any hints? Thanks

PS: Is there a newer MBv8_8.04 for Linux (the current one is r3306, Windows is at r3500) or a setiathome_x41z update?

The_Matrix
Volunteer tester
Joined: 17 Nov 03
Posts: 414
Credit: 5,827,850
RAC: 0
Germany
Message 1810144 - Posted: 18 Aug 2016, 14:38:20 UTC

I cut my CPU usage with the percentage setting (75%) in the BOINC client. How did you do that?

Did you tune

-sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64

for high-end cards in the *opencl_NV.txt file? Or are you using CUDA?

Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1810149 - Posted: 18 Aug 2016, 14:45:30 UTC

Looks like you're using CUDA which is much slower by default. To compete with OpenCL you'll need to run multiple GPU tasks concurrently and probably run the GUPPI rescheduler, too.

After farting around with my 980Ti for a while I gave up and just switched back to stock OpenCL, and my RAC is better than ever.
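For readers wondering how "multiple GPU tasks concurrently" is usually configured: a minimal sketch, assuming the BOINC `app_config.xml` mechanism with its `gpu_usage` setting and assuming the application name `setiathome_v8` (neither is stated in this thread, and anonymous-platform setups may set the equivalent value in `app_info.xml` instead). The helper name `app_config_xml` is hypothetical. The fraction is rounded down so that the requested number of tasks always fits on one GPU.

```python
import math
import xml.etree.ElementTree as ET


def app_config_xml(app_name: str, tasks_per_gpu: int, cpu_per_task: float = 1.0) -> str:
    """Build app_config.xml text asking for `tasks_per_gpu` tasks per GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    # Round the GPU fraction DOWN to 4 decimals so that
    # tasks_per_gpu * fraction never exceeds 1.0.
    fraction = math.floor(10_000 / tasks_per_gpu) / 10_000
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name  # assumed application name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = f"{fraction:.4f}"
    ET.SubElement(gpu_versions, "cpu_usage").text = f"{cpu_per_task:.2f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Four tasks per GPU, one CPU thread reserved per GPU task.
    print(app_config_xml("setiathome_v8", 4))
```

The output would go into the project directory as `app_config.xml`; whether two, three or four tasks per GPU is best has to be found by watching run times.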

W3Perl
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1810150 - Posted: 18 Aug 2016, 14:51:51 UTC - in response to Message 1810144.  

I cut my CPU usage with the percentage setting (75%) in the BOINC client. How did you do that?



in global_prefs_override.xml
<max_ncpus_pct>70.000000</max_ncpus_pct>
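As an illustration of what that setting does on an 8-thread CPU such as the i7-6700K: a minimal sketch, assuming the client truncates the product of thread count and percentage (with a floor of one CPU) and assuming the override file's root element is `<global_preferences>`. Both are assumptions not confirmed in this thread, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def usable_cpus(logical_cpus: int, max_ncpus_pct: float) -> int:
    """CPUs BOINC may use, assuming truncation and a minimum of 1."""
    return max(1, int(logical_cpus * max_ncpus_pct / 100))


def override_xml(max_ncpus_pct: float) -> str:
    """Text for a global_prefs_override.xml holding only the CPU limit."""
    root = ET.Element("global_preferences")  # assumed root element
    ET.SubElement(root, "max_ncpus_pct").text = f"{max_ncpus_pct:.6f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 8 logical CPUs at 70% gives 5 for CPU work, leaving 3 to feed the GPU.
    print(usable_cpus(8, 70.0))
    print(override_xml(70.0))
```

Under these assumptions, 70% of 8 threads comes out at 5 CPUs, which matches the "5 running, 3 feeding the GPU" split described in the first post.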



Did you tune

-sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64

for high-end cards in the *opencl_NV.txt file? Or are you using CUDA?


I have installed :
setiathome_x41zi_x86_64-pc-linux-gnu_cuda60.7z
and
MBv8_8.04r3306_sse4.2_linux64_CPU.7z
in the BOINC/projects/setiathome.berkeley.edu/ directory.

There is no *.txt file in those packages. In which file should I add this line?

W3Perl
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1810152 - Posted: 18 Aug 2016, 15:07:05 UTC - in response to Message 1810149.  

Looks like you're using CUDA which is much slower by default. To compete with OpenCL you'll need to run multiple GPU tasks concurrently and probably run the GUPPI rescheduler, too.



Time to congratulate you for your nice graph :)
Is there any difference between Windows and Linux? I suppose you mix them.


After farting around with my 980Ti for a while I gave up and just switched back to stock OpenCL, and my RAC is better than ever.


I saw a big increase using the anonymous platform over stock on my GTX 950/750 hosts. It may be the answer...

The_Matrix
Volunteer tester
Joined: 17 Nov 03
Posts: 414
Credit: 5,827,850
RAC: 0
Germany
Message 1810153 - Posted: 18 Aug 2016, 15:17:39 UTC
Last modified: 18 Aug 2016, 15:19:00 UTC

mb_cmdline-8.10-opencl_ati5_SoG.txt

That's the name of the OpenCL ATI/AMD file.

Your NV file must be almost the same...

/var/lib/boinc-client/projects/setiathome.berkeley.edu

With root privileges and "nautilus" you can edit the *opencl_NV.txt file.

There are too few options for CUDA workunits...

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1810164 - Posted: 18 Aug 2016, 15:50:25 UTC - in response to Message 1810153.  

mb_cmdline-8.10-opencl_ati5_SoG.txt

That's the name of the OpenCL ATI/AMD file.

Your NV file must be almost the same...

Take care when making assumptions like that: the NV file name has changed within the last 48 hours, and the SoG build is no longer deemed appropriate for ATI/AMD cards (it didn't improve performance).

The_Matrix
Volunteer tester
Joined: 17 Nov 03
Posts: 414
Credit: 5,827,850
RAC: 0
Germany
Message 1810170 - Posted: 18 Aug 2016, 16:15:33 UTC

@Richard, next time I will hold back from writing; I can't check that because I lack NVIDIA GPUs in the other host.

Thanks to the provider who gave me these files by download. They were downloaded today, so why are they there? (Not questioning, just kidding.)

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22190
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1810172 - Posted: 18 Aug 2016, 16:29:31 UTC

Passing thought - perhaps you are running too many concurrent tasks on the 1080. With them being so new nobody is sure of how many tasks is the optimum.
Questions to consider:
What is the run time of each task, and how does that compare with the run time of similar tasks when running only one task?
Did you just jump in at four concurrent, or did you build up to that watching the run-time for individual tasks?

And - if the Linux offerings are anything like the Windoze offerings then the stock CUDAxx offering will be substantially slower than the "stock" SoG for the majority of tasks.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

W3Perl
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1810183 - Posted: 18 Aug 2016, 17:59:03 UTC - in response to Message 1810153.  

mb_cmdline-8.10-opencl_ati5_SoG.txt

That's the name of the OpenCL ATI/AMD file.

Your NV file must be almost the same...

/var/lib/boinc-client/projects/setiathome.berkeley.edu

With root privileges and "nautilus" you can edit the *opencl_NV.txt file.

There are too few options for CUDA workunits...


This file should be available when running stock. I have it on my seti@home Beta account, but not in the seti@home project, where I run the anonymous platform.

I will switch to stock then.

W3Perl
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1810185 - Posted: 18 Aug 2016, 18:02:14 UTC - in response to Message 1810172.  

Passing thought - perhaps you are running too many concurrent tasks on the 1080. With them being so new nobody is sure of how many tasks is the optimum.
Questions to consider:
What is the run time of each task, and how does that compare with the run time of similar tasks when running only one task?
Did you just jump in at four concurrent, or did you build up to that watching the run-time for individual tasks?


I've tried default (one GPU task) but it was worse.




And - if the Linux offerings are anything like the Windoze offerings then the stock CUDAxx offering will be substantially slower than the "stock" SoG for the majority of tasks.


I use the anonymous platform because when v8 was launched, there was no NVIDIA client (only ATI).

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22190
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1810188 - Posted: 18 Aug 2016, 18:29:13 UTC

A quick look at the times for the 1080 shows some very long run-times; this suggests that it is overloaded, or has the wrong drivers.
You say one task at a time was slower than four at a time - this is not possible - I think what you are looking at is the rate at which tasks are being returned, not the time each task is taking to run.
If you try and run a high end nVidia card with an ATI/AMD set of commands it will almost certainly have a very poor performance - CUDA apps and the AMD apps are very different.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

W3Perl
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1810203 - Posted: 18 Aug 2016, 20:53:16 UTC - in response to Message 1810188.  

A quick look at the times for the 1080 shows some very long run-times; this suggests that it is overloaded, or has the wrong drivers.


I use the latest nvidia drivers 367.35


You say one task at a time was slower than four at a time - this is not possible - I think what you are looking at is the rate at which tasks are being returned, not the time each task is taking to run.


RAC was lower with one at a time than with four at a time.
With four GPU tasks, I freed CPU cores until the nvidia-smi display showed 90-95%.


If you try and run a high end nVidia card with an ATI/AMD set of commands it will almost certainly have a very poor performance - CUDA apps and the AMD apps are very different.


I have no ATI commands; I've just unzipped the setiathome_x41z and MBv8_8.04r3306 packages from Lunatics.

Anyway, I will revert to stock...

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22190
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1810296 - Posted: 19 Aug 2016, 4:57:21 UTC

Running for maximum GPU use may not be the optimum for getting tasks done in a reasonable time; indeed it rarely is. Watch the run times: as you run more tasks you will find they increase slowly, then suddenly leap up. The point you need to be at is just before the big leap. I looked at your run times, and a number were getting towards two hours, compared with the 30-40 minutes I'm seeing on my GTX 1080 running 2 (SoG) or about 50 minutes running CUDA.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
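The difference between per-task run time and the rate at which tasks are returned can be put into numbers. A minimal sketch, assuming all concurrent tasks take about the same time and using only the rough run times quoted in this thread (35 minutes is the midpoint of the 30-40 minute range; the figures come from different hosts and applications, so the comparison is only indicative). The function name is hypothetical.

```python
def tasks_per_hour(concurrent: int, minutes_per_task: float) -> float:
    """Throughput when `concurrent` tasks each take `minutes_per_task`."""
    return concurrent * 60.0 / minutes_per_task


# (concurrent tasks, minutes per task), rough figures quoted in the thread
observations = {
    "4 concurrent, ~120 min each": (4, 120.0),
    "2 concurrent SoG, ~35 min each": (2, 35.0),
    "1 CUDA task, ~50 min": (1, 50.0),
}

if __name__ == "__main__":
    for label, (count, minutes) in observations.items():
        print(f"{label}: {tasks_per_hour(count, minutes):.2f} tasks/hour")
```

On these figures, four tasks at about two hours each return fewer tasks per hour than two tasks at about 35 minutes each, which is the "just before the big leap" argument in numeric form.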

M_M
Joined: 20 May 04
Posts: 76
Credit: 45,752,966
RAC: 8
Serbia
Message 1810299 - Posted: 19 Aug 2016, 5:09:58 UTC - in response to Message 1810296.  
Last modified: 19 Aug 2016, 5:51:46 UTC

Guppies on my GTX 1080 take about 13-14 min each (latest SoG r3500, running 2 in parallel). This is on my i7-2600k, where I have limited CPU usage to 50% to be able to feed the monster properly and yet have a reasonably responsive system to work on at the same time.

I am using the command line switches suggested in the readme for high-end GPUs:
-sbs 384 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64

Average TDP with Guppies is around 58% (over 70% with non-Guppies), average GPU load around 97%.

Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1810300 - Posted: 19 Aug 2016, 5:13:14 UTC - in response to Message 1810299.  

M_M,

50% of which? CPUs or CPU time?

Zalster

M_M
Joined: 20 May 04
Posts: 76
Credit: 45,752,966
RAC: 8
Serbia
Message 1810302 - Posted: 19 Aug 2016, 5:31:35 UTC - in response to Message 1810300.  
Last modified: 19 Aug 2016, 5:43:20 UTC

The i7-2600k is a 4-core/8-thread CPU. I have experimented a bit and found this to be the optimal setting on my config for maximum RAC. Even though CPU load is shown as 100%, the system (Win10) is reasonably responsive for normal work (surfing, office work etc).

If I set more than 50% CPU with 2 SoG in parallel, the system is a bit less responsive and GPU WU times increase (visible also in lower GPU TDP, which is a better indication of GPU use than the GPU load indicator itself). I have not experimented with limiting CPU time usage; so far it has always been set to 100%...

petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1810511 - Posted: 19 Aug 2016, 21:59:30 UTC
Last modified: 19 Aug 2016, 22:01:50 UTC

Guppies take like 5 min when running one at a time on a CUDA GTX 1080.
http://setiathome.berkeley.edu/results.php?hostid=7475713&offset=0&show_names=0&state=4&appid=

Be patient. The GPU you have has the power to run them at 3-4x speed. The software is just not ready for that. And I'm sure there is still a lot to improve.

petri.

EDIT: Hardware added
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones

M_M
Joined: 20 May 04
Posts: 76
Credit: 45,752,966
RAC: 8
Serbia
Message 1810657 - Posted: 20 Aug 2016, 6:48:20 UTC - in response to Message 1810511.  

Yes, I see that from your WUs; obviously there is a lot of room for improvement.

I am sure that you, Richard, Raistmer and others are putting effort into improving the applications to use this "spare" GPU potential...

W3Perl
Volunteer tester
Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1812451 - Posted: 25 Aug 2016, 11:46:13 UTC

Thanks to petri33, who kindly sent me his own binaries, my GTX 1080 rocks now!
It takes around 4 minutes to complete a WU (blc or classic) :)

Thanks for your help :)