Linux CUDA 'Special' App finally available, featuring Low CPU use

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1864882 - Posted: 30 Apr 2017, 22:36:46 UTC - in response to Message 1864875.  
Last modified: 30 Apr 2017, 22:37:58 UTC

That core2duo would be a great candidate for Linux. Drop the CPU tasks and get 30-32k out of those 750Ti's :D

I've considered it many times. Even doing no CPU crunching with the SoG application the system still kept falling over.
The problem is booting from the USB port- it'll find the thumb drive with the LINUX install/setup image, but then can't boot it; yet the other system can find it & boot from it. Add to that it sometimes loses the network port, as well as a couple of the HDD ports. It's copped quite a few power surges over the years and has a few quirks as a result.

When I get the funds to replace my other system, the C2D will be retired and the current i7 will become the backup/emergency use system. And even it has issues due to power surges over the years (network port was burnt out by a lightning strike through phone line- modem- network lead), and it can take 10 minutes to boot up from a cold start, hence I never shut it down. (Always wonder if it will restart after a power outage).
Grant
Darwin NT
ID: 1864882
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1864890 - Posted: 30 Apr 2017, 22:59:09 UTC - in response to Message 1864882.  

Many times you can only boot from certain USB ports, usually the closest to the PS2 ports.
I hear you on the boot times. My AMD4200 is lucky to reboot in under 15 minutes in Windows. With Linux it's a little over a minute. My i7 is about 11s after the BIOS :)
ID: 1864890
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1864900 - Posted: 30 Apr 2017, 23:21:59 UTC - in response to Message 1864890.  

Many times you can only boot from certain USB ports, usually the closest to the PS2 ports.

Front, back, side (well not side..) it'll find the thumb drive, but it just won't boot from it.

I hear you on the boot times. My AMD4200 is lucky to reboot in under 15 minutes in Windows. With Linux it's a little over a minute. My i7 is about 11s after the BIOS :)

This one is purely some hardware issue.
The power supply will come on, fans will power up, motherboard LEDs light up, a few seconds later you'll get the Beep, then it shuts down, and does it all over again. Sometimes once or twice, sometimes a couple of dozen times. Then eventually it will get to a point where it beeps, then the BIOS screen comes up & it continues on its merry way.
I thought it might have been related to the power_good signal from the PSU- but no different with another PSU. So I figure it's just had one power surge too many.
As you say, once it gets past the BIOS to the Loading OS message, it's a matter of seconds. Got to love SSDs, even "old" ones.
Grant
Darwin NT
ID: 1864900
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864921 - Posted: 1 May 2017, 0:58:40 UTC - in response to Message 1864835.  
Last modified: 1 May 2017, 1:07:32 UTC


I haven't gotten to the point yet of comparing the performance of the two cards vs. the SoG app in Windows. Perhaps this evening I'll get to that. The 960 is indeed on a riser cable, but from an x8/x4 slot rather than x1. According to the NVIDIA info display, the Memory Transfer Rate is the same for both cards, and both are running just slightly under their maximum clock rates, with the 780 showing at the P3 level and the 960 at P2 (with the same clock range as P3). I suppose the fact that the monitor is connected to the 960 could cause a slight performance degradation, but I don't really know what, if any, the impact might be.

I don't know if I want to consider replacing the 670 or not. It's certainly a productive workhorse on the Windows side and it seems a shame that the Special App can't accommodate the GTX 6xx series of cards. I've still got a GTX 660 in one of my other crunch-only hosts (along with two 750Ti's and a 960). I'll probably evaluate the whole situation after running with just the 2 cards on the Special App for a few weeks and see if the added throughput on those manage to overcome the loss of the 670. If not, then I'd probably go back to running SoG in Windows, rather than replace the 670.


. . Hi,

. . Perhaps a heart/lung (GPU) transplant might do the job if the machines are suitable donor/recipients. Move the 6xx series cards into the Windows based rigs and the 7xx and 9xx series cards into the Linux rigs. Hopefully you will manage to get the best of both worlds then. As TBar said, the 750ti should put out some good numbers using CUDA80. I am running a single 1050ti in my Core2 Duo and that is hitting the 25K mark. I am very sure that the 750ti alone would go over 15K and 19K could be possible. Actually your Core2 Duo with the one 750ti would be an excellent candidate for a Linux/CUDA80 refit. Don't bother with CPU crunching, turn off -bs (sorry TBar) and let the 750ti strut its stuff. I feel sure the numbers will blow you away. It would probably do something like 200 tasks a day or close to it.

Stephen

:)
ID: 1864921
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864930 - Posted: 1 May 2017, 1:28:13 UTC - in response to Message 1864860.  

Hi fellows,
The 780 is doing admirably well. It has 12 cores.

All new cards (750+) do good too.

I had a sad feeling with my 980 series. They did not perform that well. The 780 outperformed the 980 on shorties when benchmarking. The wattage of the 780 is justified anyway. It can do. I never had a chance to try with a 780Ti. I guess it would have done well. On lower AR (like vlar) the 980 did well.

a) you need a lot of GPU cores (sm/smx) -- this is where my optimisation is aimed at. The ATI/AMD have 64 of those.. My Ti has 28.
b) any new GPU can do more instructions per clock. (Maxwell/Pascal)
c) a fast GPU RAM helps. -- Lower GPU clock, higher RAM? temp v.s. speed.

--
Petri


. . Hi Petri,

. . I have found that too, with my 970s running SoG they do much better on the Guppis than on the Arecibo work. When running doubles, halflings (shorties) take 7 to 8 mins (4 mins each), Normal AR Arecibos (NARAs) take 13 to 14 mins (7) but Guppis (Blc02) take only 16 to 18 mins (9). That is closest to matching runtimes as any of my GPUs can manage. While on the Core2 Duo running CUDA80 my 1050ti is great with Arecibo work, Halflings take 2 mins, NARAs take 4.5 mins and Guppis take 7.5 mins, still a big difference.

Stephen

..
ID: 1864930
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864934 - Posted: 1 May 2017, 1:44:34 UTC - in response to Message 1864872.  

Under 32-bit Win7, I seem to be stretching the RAM limits due to some video RAM mapping that I've never been able to sort out. It's running right on the ragged edge with the current configuration.

I have a 32bit Win Vista system with a C2D and 2* GTX 750Tis. I am unable to run the SoG application on it due to video driver restarts. The lack of available physical RAM (due to the 32bit OS), slow CPU clock speed & limited number of cores just make it impossible for the system to meet the demands of even those low power cards.

I don't know what might be inhibiting the 960.

CPU clock speed?
The faster the GPU crunches, the more CPU support it needs. Even with a CPU reserved per GPU WU, if the CPU is too slow to keep up, then the GPU output will be significantly impacted.
I can't see PCIe bandwidth having that great an impact on crunching times unless the systems are only PCIe v1 specification. Even then, such a big performance hit seems unlikely... a 50% hit, maybe. But 2-3 times?


. . I can pretty much rule out the PCIe bus being the limitation. I am running 2 x GTX1060-6GB on my Pentium-d with a Gen 1.1 bus. Originally I had left the PCIe config at default which meant the 1st card was x16 and the second was x1. There was a difference of only a few seconds in their runtimes. I have corrected that and they are now both x8, and their runtimes are pretty much identical. But you can't go back much lower than that, a Gen 1.1 bus at x1, and yet the 1060 was doing very well.
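
For a sense of scale on the x16 vs x1 vs x8 comparison above, here is a rough Python sketch of the theoretical per-direction link bandwidth. The per-lane rates and encodings are the published PCIe figures (Gen 1.x uses 2.5 GT/s with 8b/10b encoding), but the function name `pcie_bandwidth_mb_s` is just something made up for illustration, and these are peak link numbers, not anything measured on the hosts in this thread.

```python
# Per-lane raw signalling rate (GT/s) and line-encoding efficiency per PCIe generation.
# Gen 1.0 and 1.1 share the same 2.5 GT/s rate.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
}


def pcie_bandwidth_mb_s(generation: int, lanes: int) -> float:
    """Theoretical peak bandwidth in MB/s, one direction, for a given link."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s * efficiency = usable Gbit/s per lane; / 8 = GB/s; * 1000 = MB/s
    per_lane_mb_s = gt_per_s * efficiency / 8 * 1000
    return per_lane_mb_s * lanes


if __name__ == "__main__":
    for lanes in (1, 8, 16):
        print(f"Gen 1 x{lanes}: {pcie_bandwidth_mb_s(1, lanes):.0f} MB/s")
```

So a Gen 1.1 x1 link tops out around 250 MB/s each way against roughly 4000 MB/s for x16. If the runtimes only moved by a few seconds across that gap, that would fit with the app not being limited by the bus on those cards, though the sketch itself says nothing about how much data the app actually moves.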

. . On the other hand, how much ram is in your C2D Grant? Have you considered running Linux 64 bit with CUDA80? It would become a 30K machine at least :). All you need to give it a try is a $10 flashdrive and the inclination. :)

Stephen

:)
ID: 1864934
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864935 - Posted: 1 May 2017, 1:47:30 UTC - in response to Message 1864875.  

That core2duo would be a great candidate for Linux. Drop the CPU tasks and get 30-32k out of those 750Ti's :D


. . My thoughts exactly Brent!

Stephen

:)
ID: 1864935
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864937 - Posted: 1 May 2017, 1:52:44 UTC - in response to Message 1864882.  

That core2duo would be a great candidate for Linux. Drop the CPU tasks and get 30-32k out of those 750Ti's :D

I've considered it many times. Even doing no CPU crunching with the SoG application the system still kept falling over.
The problem is booting from the USB port- it'll find the thumb drive with the LINUX install/setup image, but then can't boot it; yet the other system can find it & boot from it. Add to that it sometimes loses the network port, as well as a couple of the HDD ports. It's copped quite a few power surges over the years and has a few quirks as a result.

When I get the funds to replace my other system, the C2D will be retired and the current i7 will become the backup/emergency use system. And even it has issues due to power surges over the years (network port was burnt out by a lightning strike through phone line- modem- network lead), and it can take 10 minutes to boot up from a cold start, hence I never shut it down. (Always wonder if it will restart after a power outage).


. . I guess that means you would have to bite the bullet and create a Linux Live install DVD then reformat and install to your HDD. That is a big step, but it could be the way to go.

Stephen

??
ID: 1864937
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864939 - Posted: 1 May 2017, 1:55:43 UTC - in response to Message 1864900.  

Many times you can only boot from certain USB ports, usually the closest to the PS2 ports.

Front, back, side (well not side..) it'll find the thumb drive, but it just won't boot from it.

I hear you on the boot times. My AMD4200 is lucky to reboot in under 15 minutes in Windows. With Linux it's a little over a minute. My i7 is about 11s after the BIOS :)

This one is purely some hardware issue.
The power supply will come on, fans will power up, motherboard LEDs light up, a few seconds later you'll get the Beep, then it shuts down, and does it all over again. Sometimes once or twice, sometimes a couple of dozen times. Then eventually it will get to a point where it beeps, then the BIOS screen comes up & it continues on its merry way.
I thought it might have been related to the power_good signal from the PSU- but no different with another PSU. So I figure it's just had one power surge too many.
As you say, once it gets past the BIOS to the Loading OS message, it's a matter of seconds. Got to love SSDs, even "old" ones.


. . I realise this is probably a redundant question but have you reset the BIOS to defaults?

Stephen

??
ID: 1864939
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1864941 - Posted: 1 May 2017, 1:58:38 UTC - in response to Message 1864937.  

. . I guess that means you would have to bite the bullet and create a Linux Live install DVD then reformat and install to your HDD. That is a big step, but it could be the way to go.

No DVD unit.
And given the state and age of the system, not really worth spending any money on.
Grant
Darwin NT
ID: 1864941
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864959 - Posted: 1 May 2017, 3:46:34 UTC - in response to Message 1864941.  

. . I guess that means you would have to bite the bullet and create a Linux Live install DVD then reformat and install to your HDD. That is a big step, but it could be the way to go.

No DVD unit.
And given the state and age of the system, not really worth spending any money on.


. . A shame about the problems but everything has a life span ....

. . Well I hope it lasts long enough for your upgrade to go smoothly.

Stephen

..
ID: 1864959
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1864962 - Posted: 1 May 2017, 4:00:18 UTC - in response to Message 1864959.  

Stephen, I really wish you would read a thread completely before commenting on things that have already been discussed and closure given.
And PLEASE lay off the "Quote" Button!
ID: 1864962
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22204
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1864988 - Posted: 1 May 2017, 7:43:38 UTC - in response to Message 1864900.  

Grant - Your struggle to boot problem may be a dead (or almost dead) battery needing a bit of a kick so the BIOS can read enough of the startup data to get going.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1864988
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1864991 - Posted: 1 May 2017, 7:59:15 UTC - in response to Message 1864988.  
Last modified: 1 May 2017, 8:01:47 UTC

Grant - Your struggle to boot problem may be a dead (or almost dead) battery needing a bit of a kick so the BIOS can read enough of the startup data to get going.

Will give that a check.
So far the date & time have been correct each time it has restarted from being powered down.


EDIT- will wait for the next power outage. Have to pull one of the video cards to get to the CMOS battery.
Grant
Darwin NT
ID: 1864991
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1864999 - Posted: 1 May 2017, 9:17:09 UTC - in response to Message 1864962.  

. . After I read the other replies I tried to edit mine but it was too late, I got into the edit mode but the updated version was rejected as out of time ... :(

. . Sorry about that.
ID: 1864999
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1865045 - Posted: 1 May 2017, 16:24:42 UTC - in response to Message 1864874.  
Last modified: 1 May 2017, 16:27:23 UTC

Hi fellows,
The 780 is doing admirably well. It has 12 cores.
--
Petri
The problem I ran into seemed to be some sort of incompatibility between the 780, TBar's Cuda 8.0 version of the Special App, and a small percentage of processed tasks (all but one being guppi VLARs). It did not occur with my GTX 960 using Cuda 8.0 and it is not occurring now running the Cuda 6.5 version on the 780.

I detailed the problem in my Message 1864468 and my followup in Message 1864585. You can still see examples of the Stderr output in any of the Invalid tasks for host 8253697 and in most of the Inconclusive tasks for that host that were reported on 28 Apr.....It would be interesting to see if anybody else with a 780 experiences the same issue with the Cuda 8.0 app or if this is somehow unique to my setup.
I think you'll find the older Kepler GPUs just don't work well with the newer CUDA versions. Since there aren't many newer CUDA Apps around here, you might have problems finding examples other than the Linux Apps. There is a Mac CUDA 7.5 App on Main, and as I stated, I have seen some of the older GPUs have problems with it. If you wish to try another Kepler GPU you might try your 630 GT which appears to be a supported model; Device 1: GeForce GT 630, 2048 MiB, regsPerBlock 65536 computeCap 3.5, multiProcs 2
My guess is it will have problems with the CUDA 8 App, but might work with the CUDA 6.5 App. Of course, you might have to install another Linux system, or move it to the existing system to try it.
If it were me, I'd yank that 670 out, place the 960 in a slot, and hang the 630 on the extender. That would exterminate two problems with one move. But, that's just me. There aren't many people around with supported Kepler cards, so, it would appear it's your move ;-)
ID: 1865045
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1865055 - Posted: 1 May 2017, 17:06:44 UTC - in response to Message 1865045.  

I think you'll find the older Kepler GPUs just don't work well with the newer CUDA versions. Since there aren't many newer CUDA Apps around here, you might have problems finding examples other than the Linux Apps. There is a Mac CUDA 7.5 App on Main, and as I stated, I have seen some of the older GPUs have problems with it. If you wish to try another Kepler GPU you might try your 630 GT which appears to be a supported model; Device 1: GeForce GT 630, 2048 MiB, regsPerBlock 65536 computeCap 3.5, multiProcs 2
My guess is it will have problems with the CUDA 8 App, but might work with the CUDA 6.5 App. Of course, you might have to install another Linux system, or move it to the existing system to try it.
If it were me, I'd yank that 670 out, place the 960 in a slot, and hang the 630 on the extender. That would exterminate two problems with one move. But, that's just me. There aren't many people around with supported Kepler cards, so, it would appear it's your move ;-)
That GT 630 resides here on my daily driver, where crunching optimization tends to be detrimental to all the other things I expect this box to do. Definitely NOT converting it to Linux, either! ;^)

However, I do have another GT 630 that I used to run in a different box that I took out of service about a year ago. I could probably dig it out and try it, but that one only has 1GB of RAM, so it might not accomplish much.

In any event, I'm still pulling performance numbers on the 960, but what I've seen so far doesn't look encouraging. It appears that overall throughput for normal and high AR tasks is actually worse with the Special App than it is with SoG in Windows (running 2 per GPU). There is a small gain for VLARs with the Special App. Hopefully, I'll have more specific numbers a little later. I think I'll also try the same exercise for the 780.
ID: 1865055
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1865063 - Posted: 1 May 2017, 17:31:47 UTC - in response to Message 1865055.  
Last modified: 1 May 2017, 18:20:42 UTC

Well, the 630 has to be the Newer GT 630 v2 with CC 3.5. The older 630 is probably the Fermi version CC 2.1, which won't work with the Special App, https://en.wikipedia.org/wiki/CUDA#GPUs_supported
Here is another 960 running the CUDA 8 version of zi3t2b; as you can see, his times are even faster: blc02 = Run time: 6 min 29 sec. So....I'd say it's pretty obvious there is something wrong with your 960 setup....probably the riser setup.
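
On the compute capability point, a quick way to check a card is to read the "Device N:" line from a task's Stderr output. Below is a small Python sketch that parses a line in the format quoted earlier in this thread and compares it against a minimum. The 3.5 cutoff is only an assumption pieced together from this discussion (CC 3.5 Kepler reported as supported, Fermi CC 2.1 and the GTX 6xx cards not), not a documented requirement, and the helper names are invented for the example.

```python
import re

# Matches the device summary line as it appears in the Stderr output quoted in this thread.
DEVICE_LINE = re.compile(
    r"Device (?P<index>\d+): (?P<name>.+?), (?P<mem_mib>\d+) MiB, "
    r"regsPerBlock (?P<regs>\d+) computeCap (?P<cc>\d+\.\d+), "
    r"multiProcs (?P<sms>\d+)"
)

# ASSUMPTION: minimum compute capability inferred from this thread, not from app docs.
ASSUMED_MIN_CC = (3, 5)


def parse_device_line(line):
    """Return a dict describing the device, or None if the line does not match."""
    match = DEVICE_LINE.search(line)
    if match is None:
        return None
    major, minor = match.group("cc").split(".")
    return {
        "index": int(match.group("index")),
        "name": match.group("name"),
        "mem_mib": int(match.group("mem_mib")),
        "compute_cap": (int(major), int(minor)),
        "multi_procs": int(match.group("sms")),
    }


def meets_assumed_minimum(device):
    """True if the device's compute capability is at or above the assumed cutoff."""
    return device["compute_cap"] >= ASSUMED_MIN_CC


if __name__ == "__main__":
    line = ("Device 1: GeForce GT 630, 2048 MiB, regsPerBlock 65536 "
            "computeCap 3.5, multiProcs 2")
    device = parse_device_line(line)
    print(device["name"], device["compute_cap"], meets_assumed_minimum(device))
```

Passing the check only says the card clears the assumed cutoff; as discussed above, a CC 3.5 Kepler card may still behave differently under the CUDA 8 and CUDA 6.5 builds.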

BTW, Anyone running an Older version of the Special App should Upgrade to the current zi3t2b version. Not only does it produce fewer Inconclusive results, it is much faster on the BLC tasks than the early versions.
Linux CUDA Special Apps
ID: 1865063
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1865069 - Posted: 1 May 2017, 18:17:45 UTC - in response to Message 1865063.  
Last modified: 1 May 2017, 18:31:41 UTC

Well, the 630 has to be the Newer GT 630 v2 with CC 3.5. The older 630 is probably the Fermi version CC 2.1, which won't work with the Special App, https://en.wikipedia.org/wiki/CUDA#GPUs_supported
Here is another 960 running the CUDA 8 version of zi3t2b; as you can see, his times are even faster: blc02 = Run time: 6 min 29 sec. So....I'd say it's pretty obvious there is something wrong with your 960 setup....probably the riser setup.
Yeah, it's the Kepler version, passively cooled, but with only 1GB RAM.

Okay, I pulled run times on the 960 for High, Normal and Low ARs, 6 tasks for each range and then averaged them. I've compared times for the Linux "Special" and the Windows 8.1 "SoG", both running on the same box with the same connection for the 960. Since I run 2 tasks per GPU on SoG, I also calculated an estimated total throughput as "Tasks per Hour" to get a more accurate comparison between the Special App and SoG than the Run Time alone would provide. And, just as a "control", I also pulled run times for similar GTX 960s that are running on another box and have almost the same clock rate as the one in question. Here's what I got:
                 Host 8253697    |    Host 7057115      |    Host 8064262
                Linux "Special"  | Win8.1 "SoG" (2/GPU) | Win7 "SoG" (2/GPU)
               Avg RT (Tasks/Hr) |  Avg RT (Tasks/Hr)   | Avg RT (Tasks/Hr)
              -------------------|----------------------|-------------------                   
High AR -----     7:10 (8.37)    |     10:33 (11.37)    |    10:52 (11.04)
Normal AR ---    14:11 (4.23)    |     19:36 (6.12)     |    19:27 (6.17)
VLAR --------    13:55 (4.31)    |     32:01 (3.74)     |    29:57 (4.00)

Only on VLARs does the Special App seem to provide better throughput. And it seems to me that if the riser connection was a problem for the 960, it would also show up in degraded run times when compared to the SoG times for 960s in the other box, whereas those times are quite similar. Does it seem likely that Linux would have a problem with a GPU on a riser, when Windows does not? I dunno.
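
For anyone wanting to redo the Tasks/Hr figures in the table above, here is a minimal Python sketch of the arithmetic as it appears to work: number of tasks running at once on the GPU, times 60, divided by the average run time in minutes. The function name `tasks_per_hour` and its parameters are made up for illustration, and the formula is inferred from the posted numbers rather than taken from either app.

```python
def tasks_per_hour(avg_run_time: str, concurrent: int = 1) -> float:
    """Estimate GPU throughput from an average run time given as 'MM:SS'.

    concurrent is how many tasks run on the GPU at the same time
    (1 for the Special App here, 2 for SoG at 2 per GPU).
    """
    minutes, seconds = avg_run_time.split(":")
    run_minutes = int(minutes) + int(seconds) / 60
    return concurrent * 60 / run_minutes


if __name__ == "__main__":
    # High AR row: Linux "Special" (1 per GPU) vs Win8.1 "SoG" (2 per GPU)
    print(round(tasks_per_hour("7:10", 1), 2))   # 8.37
    print(round(tasks_per_hour("10:33", 2), 2))  # 11.37
```

The point of normalising this way is that a shorter run time alone does not mean more work done when the other app is running two tasks side by side.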

So, at the moment, those results leave me very perplexed. When I can get to it, perhaps this evening, I'll try doing a similar comparison for the run times on the 780, although I don't have one in another box to use as a control. I'm certainly curious to see if the Special App significantly improves the throughput for the 780.

EDIT: Hmmm....I see in the output for that last example you provided that he's apparently using a "pfb" parameter, which I assume is the same as the "pfblockspersm" that I used to set in the mbcuda.cfg file when I was running Cuda in the pre-SoG days. Could something like that be causing such an improvement?
ID: 1865069
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1865071 - Posted: 1 May 2017, 18:33:56 UTC - in response to Message 1865069.  

Yet the Top Computer list is Filled with 'Special' Hosts running over Windows machines which appear to have greater capacity, https://setiathome.berkeley.edu/top_hosts.php
It shouldn't be too difficult to figure out...
;-)
ID: 1865071